NOD nucleic acids and polypeptides

ABSTRACT

The present invention relates to the NOD proteins and nucleic acids encoding the NOD proteins. The present invention further provides assays for the detection of NOD polymorphisms and mutations associated with disease states, as well as methods of screening for ligands and modulators of NOD proteins.

This application claims priority to provisional patent application Ser. No. 60/452,274, filed Mar. 05, 2004, which is herein incorporated by reference in its entirety.

This invention was made with government support under Grants No. DK61707 and GM60421 awarded by the National Institutes of Health. The Government has certain rights in the invention.

FIELD OF THE INVENTION

The present invention relates to the NOD proteins and nucleic acids encoding the NOD proteins. The present invention further provides assays for the detection of NOD polymorphisms and mutations associated with disease states, as well as methods of screening for ligands and modulators of NOD proteins.

BACKGROUND OF THE INVENTION

The removal of infectious agents by the host is fundamental for the survival of multicellular organisms. In animals and plants, the initial detection of microbial agents relies on specialized host receptors that recognize molecules expressed exclusively by microbes (Dangl and Jones, Nature 411, 826-833 (2001); Medzhitov, Nature Rev. Immunol. 1, 135-145 (2001)). In animals, detection of microbial agents is mediated by the recognition of pathogen-associated molecular patterns (PAMPs) by specific host pattern-recognition receptors (PRRs) (Medzhitov, supra). Because the structure of each PAMP is highly conserved and invariant in microorganisms of the same class, the animal can recognize most or all microbes with a limited number of PRRs. The identification and characterization of plasma membrane Toll-like receptors (TLRs) as PRRs have provided fundamental insight into the mechanisms of host defense in animals. There is now compelling evidence that TLRs play a pivotal role in mediating immune responses to bacterial pathogens (Medzhitov, supra; Akira et al., Nat. Immunol. 2, 675-680 (2001)). In mammals, TLRs mediate host immune responses by inducing the secretion of several proinflammatory cytokines and co-stimulatory surface molecules through the activation of transcriptional factors including NF-κB (Medzhitov, supra; Akira et al., supra). The

SUMMARY OF THE INVENTION

The present invention relates to the NOD proteins and nucleic acids encoding the NOD proteins. The present invention further provides assays for the detection of NOD polymorphisms and mutations associated with disease states, as well as methods of screening for ligands and modulators of NOD proteins.

Accordingly, in some embodiments, the present invention provides a composition comprising an isolated and purified nucleic acid sequence encoding a protein selected from the group consisting of SEQ ID NOs: 12-22. In some embodiments, the sequence is operably linked to a heterologous promoter. In some embodiments, the sequence is contained within a vector. In some embodiments, the vector is within a host cell. In some embodiments, the nucleic acid comprises one of SEQ ID NOs: 1 and variants thereof that are at least 80%, preferably at least 90%, and even more preferably at least 95% identical to SEQ ID NOs: 12-22. In some embodiments, the nucleic acid comprises one of SEQ ID NOs: 1-11.

The present invention further provides a composition comprising a polypeptide having an amino acid sequence comprising SEQ ID NOs: 12-22 or variants thereof that are at least 80% identical to SEQ ID NOs: 12-22. In some embodiments, the polypeptide is at least 90%, and preferably at least 95% identical to SEQ ID NOs: 12-22. In some embodiments, the polypeptide comprises one of SEQ ID NOs: 12-22.

The present invention additionally provides a method of generating an inflammation profile, comprising providing a sample from a subject, wherein the sample comprises nucleic acid; and detecting the presence or absence of expression of at least two NOD genes in the sample, thereby generating an inflammation profile. In some embodiments, the detecting comprises detecting the presence or absence of expression of at least 5, and preferably at least 10 NOD genes in said sample. In some embodiments, the nucleic acid comprises genomic DNA. In other embodiments, the nucleic acid comprises mRNA.

DESCRIPTION OF THE FIGURES

FIG. 1 shows the domain structures of exemplary NOD nucleic acids and proteins of some embodiments of the present invention. CARD, caspase-recruitment domain; DC, dendritic cell; DT, DEFCAP/TUCAN expanded homology domain; EBD, effector-binding domain; NOD, nucleotide-binding oligomerization domain; PYD, pyrin domain; LRR, leucine-rich repeat; WD40R, WD40 repeat; BIR, baculoviral inhibitor-of-apoptosis repeat; TIR, Toll/interleukin-1 receptor.

FIG. 2 shows an induced proximity model of NOD protein activation. EBD, effector binding domain; LRD, ligand recognition domain; NOD, nucleotide-binding oligomerization domain.

FIG. 3 shows signaling pathways mediated by NOD1, NOD2, IPAF and Cryopyrin.

FIG. 4 shows a model for the role of NOD1, NOD2 and related NODs in innate and adaptive immunity. APC, antigen-presenting cell; MHC-II, major histocompatibility complex class II molecules; TCR, T-cell receptor; TLR, Toll-like receptors.

FIG. 5 shows hypothetical mechanisms of disease in patients with mutations in NOD2, Cryopyrin, CIITA and Pyrin.

FIG. 6 shows Table 2.

FIG. 7 shows the nucleic acid sequence of NOD3 (SEQ ID NO: 1).

FIG. 8 shows the nucleic acid sequence of NOD5 (SEQ ID NO: 2).

FIG. 9 shows the nucleic acid sequence of NOD6 (SEQ ID NO: 3).

FIG. 10 shows the nucleic acid sequence of NOD8 (SEQ ID NO: 4).

FIG. 11 shows the nucleic acid sequence of NOD9 (SEQ ID NO: 5).

FIG. 12 shows the nucleic acid sequence of NOD12 (SEQ ID NO: 6).

FIG. 13 shows the nucleic acid sequence of NOD14 (SEQ ID NO: 7).

FIG. 14 shows the nucleic acid sequence of NOD17 (SEQ ID NO: 9).

FIG. 15 shows the nucleic acid sequence of NOD26 (SEQ ID NO: 10).

FIG. 16 shows the nucleic acid sequence of NOD27 (SEQ ID NO: 11).

FIG. 17 shows the amino acid sequence of NOD3 (SEQ ID NO: 12).

FIG. 18 shows the amino acid sequence of NOD5 (SEQ ID NO: 13).

FIG. 19 shows the amino acid sequence of NOD6 (SEQ ID NO: 14).

FIG. 20 shows the amino acid sequence of NOD8 (SEQ ID NO: 15).

FIG. 21 shows the amino acid sequence of NOD9 (SEQ ID NO: 16).

FIG. 22 shows the amino acid sequence of NOD12 (SEQ ID NO: 17).

FIG. 23 shows the amino acid sequence of NOD14 (SEQ ID NO: 18).

FIG. 24 shows the amino acid sequence of NOD17 (SEQ ID NO: 20).

FIG. 25 shows the amino acid sequence of NOD26 (SEQ ID NO: 21).

FIG. 26 shows the amino acid sequence of NOD27 (SEQ ID NO: 22).

FIG. 27 shows the nucleic acid sequence of NOD16 (SEQ ID NO: 8).

FIG. 28 shows the amino acid sequence of NOD16 (SEQ ID NO: 19).

DEFINITIONS

To facilitate understanding of the invention, a number of terms are defined below.

As used herein, the term “NOD” when used in reference to a protein or nucleic acid refers to a NOD protein or nucleic acid encoding a NOD protein of the present invention. The term NOD encompasses both proteins that are identical to wild-type NODs and those that are derived from wild-type NOD (e.g., variants of NOD polypeptides of the present invention or chimeric genes constructed with portions of NOD coding regions). In some embodiments, the “NOD” is a wild-type NOD nucleic acid (SEQ ID NOs: 1-11) or amino acid (SEQ ID NOs: 12-22) sequence. In other embodiments, the “NOD” is a variant or mutant.

As used herein, the term “instructions for using said kit for said detecting the presence or absence of a variant NOD nucleic acid or polypeptide in said biological sample” includes instructions for using the reagents contained in the kit for the detection of variant and wild-type NOD nucleic acids or polypeptides. In some embodiments, the instructions further comprise the statement of intended use required by the U.S. Food and Drug Administration (FDA) in labeling in vitro diagnostic products. The FDA classifies in vitro diagnostics as medical devices and requires that they be approved through the 510(k) procedure. Information required in an application under 510(k) includes: 1) The in vitro diagnostic product name, including the trade or proprietary name, the common or usual name, and the classification name of the device; 2) The intended use of the product; 3) The establishment registration number, if applicable, of the owner or operator submitting the 510(k) submission; the class in which the in vitro diagnostic product was placed under section 513 of the FD&C Act, if known, its appropriate panel, or, if the owner or operator determines that the device has not been classified under such section, a statement of that determination and the basis for the determination that the in vitro diagnostic product is not so classified; 4) Proposed labels, labeling and advertisements sufficient to describe the in vitro diagnostic product, its intended use, and directions for use. Where applicable, photographs or engineering drawings should be supplied; 5) A statement indicating that the device is similar to and/or different from other in vitro diagnostic products of comparable type in commercial distribution in the U.S., accompanied by data to support the statement; 6) A 510(k) summary of the safety and effectiveness data upon which the substantial equivalence determination is based; or a statement that the 510(k) safety and effectiveness information supporting the FDA finding of substantial equivalence will be made available to any person within 30 days of a written request; 7) A statement that the submitter believes, to the best of their knowledge, that all data and information submitted in the premarket notification are truthful and accurate and that no material fact has been omitted; 8) Any additional information regarding the in vitro diagnostic product requested that is necessary for the FDA to make a substantial equivalency determination. Additional information is available at the Internet web page of the U.S. FDA.

As used herein, the term “inflammation profile” refers to the pattern of expression of two or more NOD genes of the present invention (e.g., the NOD genes described by SEQ ID NOs: 1-11). In some embodiments, the pattern of expression comprises the presence or absence of expression. In other embodiments, the pattern of expression comprises the level of expression or localization of expression of the NOD genes. The inflammation profiles of the present invention find use in the characterization of inflammatory diseases and in determining a subject's risk of contracting an inflammatory disease. For example, in some embodiments, inflammation profiles from a subject are compared to control profiles associated with disease or predisposition to disease.

The term “gene” refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, RNA (e.g., including but not limited to, mRNA, tRNA and rRNA) or precursor (e.g., NOD). The polypeptide, RNA, or precursor can be encoded by a full-length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or fragment are retained. The term also encompasses the coding region of a structural gene and the sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The sequences that are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ untranslated sequences. The sequences that are located 3′ or downstream of the coding region and that are present on the mRNA are referred to as 3′ untranslated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.

In particular, the term “NOD gene” or “NOD genes” refers to the full-length NOD nucleotide sequence (e.g., contained in SEQ ID NOs: 1-11). However, it is also intended that the term encompass fragments of the NOD sequences, mutants of the NOD sequences, as well as other domains within the full-length NOD nucleotide sequences. Furthermore, the terms “NOD nucleotide sequence” or “NOD polynucleotide sequence” encompass DNA, cDNA, and RNA (e.g., mRNA) sequences.

Where “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms, such as “polypeptide” or “protein” are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.

In addition to containing introns, genomic forms of a gene may also include sequences located on both the 5′ and 3′ end of the sequences that are present on the RNA transcript. These sequences are referred to as “flanking” sequences or regions (these flanking sequences are located 5′ or 3′ to the non-translated sequences present on the mRNA transcript). The 5′ flanking region may contain regulatory sequences such as promoters and enhancers that control or influence the transcription of the gene. The 3′ flanking region may contain sequences that direct the termination of transcription, post-transcriptional cleavage and polyadenylation.

The term “wild-type” refers to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene. In contrast, the terms “modified,” “mutant,” “polymorphism,” and “variant” refer to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.

As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.

DNA molecules are said to have “5′ ends” and “3′ ends” because mononucleotides are reacted to make oligonucleotides or polynucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide or polynucleotide is referred to as the “5′ end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring and as the “3′ end” if its 3′ oxygen is not linked to a 5′ phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide or polynucleotide, also may be said to have 5′ and 3′ ends. In either a linear or circular DNA molecule, discrete elements are referred to as being “upstream” or 5′ of the “downstream” or 3′ elements. This terminology reflects the fact that transcription proceeds in a 5′ to 3′ fashion along the DNA strand. The promoter and enhancer elements that direct transcription of a linked gene are generally located 5′ or upstream of the coding region. However, enhancer elements can exert their effect even when located 3′ of the promoter element and the coding region. Transcription termination and polyadenylation signals are located 3′ or downstream of the coding region.

As used herein, the terms “an oligonucleotide having a nucleotide sequence encoding a gene” and “polynucleotide having a nucleotide sequence encoding a gene” mean a nucleic acid sequence comprising the coding region of a gene or, in other words, the nucleic acid sequence that encodes a gene product. The coding region may be present in a cDNA, genomic DNA, or RNA form. When present in a DNA form, the oligonucleotide or polynucleotide may be single-stranded (i.e., the sense strand) or double-stranded. Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript. Alternatively, the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.

As used herein, the term “regulatory element” refers to a genetic element that controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element that facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements include splicing signals, polyadenylation signals, termination signals, etc.

As used herein, the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “5′-A-G-T-3′” is complementary to the sequence “3′-T-C-A-5′.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. Complementarity can include the formation of base pairs between any type of nucleotides, including non-natural bases, modified bases, synthetic bases and the like.
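The base-pairing rules described above can be illustrated with a minimal Python sketch. It assumes standard Watson-Crick DNA pairing only (A with T, G with C) and equal-length strands already aligned antiparallel; the function names are hypothetical and are not part of the invention.

```python
# Illustrative sketch only: standard Watson-Crick DNA pairs are assumed.
# Non-natural, modified, or synthetic bases are not handled here.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(seq_5_to_3: str) -> str:
    """Return the complementary strand, written 3'->5' position by position."""
    return "".join(PAIRS[base] for base in seq_5_to_3.upper())


def complementarity(strand_5_to_3: str, other_3_to_5: str) -> float:
    """Fraction of aligned positions that form a base pair.

    1.0 corresponds to "complete" complementarity; any value strictly
    between 0 and 1 corresponds to "partial" complementarity.
    """
    if len(strand_5_to_3) != len(other_3_to_5):
        raise ValueError("strands must have equal length")
    if not strand_5_to_3:
        raise ValueError("strands must not be empty")
    paired = sum(
        PAIRS.get(a) == b
        for a, b in zip(strand_5_to_3.upper(), other_3_to_5.upper())
    )
    return paired / len(strand_5_to_3)
```

For the example in the text, `complement_3_to_5("AGT")` gives `"TCA"`, and `complementarity("AGT", "TCA")` gives `1.0`.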

The term “homology” refers to a degree of complementarity. There may be partial homology or complete homology (i.e., identity). A partially complementary sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid and is referred to using the functional term “substantially homologous.” The term “inhibition of binding,” when used in reference to nucleic acid binding, refers to inhibition of binding caused by competition of homologous sequences for binding to a target sequence. The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target that lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.

The art knows well that numerous equivalent conditions may be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol) are considered and the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, the art knows conditions that promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).

When used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, the term “substantially homologous” refers to any probe that can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above.

A gene may produce multiple RNA species that are generated by differential splicing of the primary RNA transcript. cDNAs that are splice variants of the same gene will contain regions of sequence identity or complete homology (representing the presence of the same exon or portion of the same exon on both cDNAs) and regions of complete non-identity (for example, representing the presence of exon “A” on cDNA 1 wherein cDNA 2 contains exon “B” instead). Because the two cDNAs contain regions of sequence identity they will both hybridize to a probe derived from the entire gene or portions of the gene containing sequences found on both cDNAs; the two splice variants are therefore substantially homologous to such a probe and to each other.

When used in reference to a single-stranded nucleic acid sequence, the term “substantially homologous” refers to any probe that can hybridize to (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.

As used herein, the term “competes for binding” is used in reference to a first polypeptide with an activity which binds to the same substrate as does a second polypeptide with an activity, where the second polypeptide is a variant of the first polypeptide or a related or dissimilar polypeptide. The efficiency (e.g., kinetics or thermodynamics) of binding by the first polypeptide may be the same as or greater than or less than the efficiency of substrate binding by the second polypeptide. For example, the equilibrium binding constant (K_(D)) for binding to the substrate may be different for the two polypeptides. The term “K_(m)” as used herein refers to the Michaelis-Menten constant for an enzyme and is defined as the concentration of the specific substrate at which a given enzyme yields one-half its maximum velocity in an enzyme catalyzed reaction.
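The definition of K_(m) can be checked numerically with a minimal Python sketch. It assumes the standard Michaelis-Menten rate law, v = V_max·[S] / (K_m + [S]), which the text names but does not write out; the function name and the numerical values are hypothetical.

```python
def mm_velocity(substrate: float, vmax: float, km: float) -> float:
    """Reaction velocity under the standard Michaelis-Menten rate law.

    substrate and km must share the same concentration units; the result
    has the units of vmax.
    """
    if substrate < 0 or vmax < 0 or km <= 0:
        raise ValueError("substrate and vmax must be >= 0 and km must be > 0")
    return vmax * substrate / (km + substrate)


# When the substrate concentration equals K_m, the velocity is half of V_max.
half_max = mm_velocity(substrate=2.0, vmax=10.0, km=2.0)
```

With these illustrative values the substrate concentration equals K_(m), so `half_max` is one-half of `vmax`, in agreement with the definition above.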

As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the T_(m) of the formed hybrid, and the G:C ratio within the nucleic acids.

As used herein, the term “T_(m)” is used in reference to the “melting temperature.” The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the T_(m) of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the T_(m) value may be calculated by the equation: T_(m)=81.5+0.41 (% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (See e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization [1985]). Other references include more sophisticated computations that take structural as well as sequence characteristics into account for the calculation of T_(m).
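The simple estimate above, T_(m)=81.5+0.41 (% G+C), can be sketched in Python as follows. The sketch assumes the result is in degrees Celsius and that the 1 M NaCl condition stated in the text holds; it applies no length or salt correction, and the function names are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleic acid sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = sum(base in ("G", "C") for base in seq)
    return 100.0 * gc_count / len(seq)


def estimate_tm(seq: str) -> float:
    """Simple T_m estimate: 81.5 + 0.41 * (% G+C).

    Assumes aqueous solution at 1 M NaCl, as stated in the text; more
    sophisticated methods also account for structure and sequence context.
    """
    return 81.5 + 0.41 * gc_percent(seq)
```

For a sequence with 50% G+C content, the estimate is 81.5 + 0.41 × 50 = 102.0.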

As used herein the term “stringency” is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. Those skilled in the art will recognize that “stringency” conditions may be altered by varying the parameters just described either individually or in concert. With “high stringency” conditions, nucleic acid base pairing will occur only between nucleic acid fragments that have a high frequency of complementary base sequences (e.g., hybridization under “high stringency” conditions may occur between homologs with about 85-100% identity, preferably about 70-100% identity). With medium stringency conditions, nucleic acid base pairing will occur between nucleic acids with an intermediate frequency of complementary base sequences (e.g., hybridization under “medium stringency” conditions may occur between homologs with about 50-70% identity). Thus, conditions of “weak” or “low” stringency are often required with nucleic acids that are derived from organisms that are genetically diverse, as the frequency of complementary sequences is usually less.

“High stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄·H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5× Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 0.1×SSPE, 1.0% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.

“Medium stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄·H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5× Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 1.0×SSPE, 1.0% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.

“Low stringency conditions” comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄·H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5× Denhardt's reagent [50× Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5×SSPE, 0.1% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.

The present invention is not limited to the hybridization of probes of about 500 nucleotides in length. The present invention contemplates the use of probes between approximately 10 nucleotides up to several thousand (e.g., at least 5000) nucleotides in length. One skilled in the relevant art understands that stringency conditions may be altered for probes of other sizes (See e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization [1985] and Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY [1989]).

The following terms are used to describe the sequence relationships between two or more polynucleotides: “reference sequence”, “sequence identity”, “percentage of sequence identity”, and “substantial identity”. A “reference sequence” is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA sequence given in a sequence listing or may comprise a complete gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides may each (1) comprise a sequence (i.e., a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity. A “comparison window”, as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by the local homology algorithm of Smith and Waterman [Smith and Waterman, Adv. Appl. Math. 2: 482 (1981)], by the homology alignment algorithm of Needleman and Wunsch [Needleman and Wunsch, J. Mol. Biol. 48:443 (1970)], by the search for similarity method of Pearson and Lipman [Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444 (1988)], by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by inspection, and the best alignment (i.e., resulting in the highest percentage of homology over the comparison window) generated by the various methods is selected. The term “sequence identity” means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison. The term “percentage of sequence identity” is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity. The term “substantial identity” as used herein denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 85 percent sequence identity, preferably at least 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 25-50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison. The reference sequence may be a subset of a larger sequence, for example, as a segment of the full-length sequences of the compositions claimed in the present invention (e.g., NOD).
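The “percentage of sequence identity” calculation described above can be sketched in Python. The sketch assumes the two sequences have already been optimally aligned by one of the cited methods and are supplied as equal-length strings with `-` marking gaps; counting gap columns toward the window size is an assumption of this sketch, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Matched positions divided by window size, multiplied by 100.

    Both inputs must be pre-aligned and of equal length. A column counts
    as matched only if both sequences carry the same base (not a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matched = sum(
        a == b and a != gap
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matched / len(aligned_a)
```

For example, two aligned 8-nucleotide windows that differ at one position share 7 of 8 positions, or 87.5 percent sequence identity. Note that the text contemplates windows of at least 20 positions; the short window here is for illustration only.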

As applied to polypeptides, the term “substantial identity” means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least 80 percent sequence identity, preferably at least 90 percent sequence identity, more preferably at least 95 percent sequence identity or more (e.g., 99 percent sequence identity). Preferably, residue positions that are not identical differ by conservative amino acid substitutions. Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Preferred conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
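The side-chain groups listed above can be encoded as a lookup, as in the following minimal Python sketch. It uses standard one-letter amino acid codes and treats a substitution as conservative when both residues fall in the same listed group; the names are hypothetical, and the narrower “preferred” groups are not modeled.

```python
# Side-chain groups exactly as listed in the text, in one-letter codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),           # Gly, Ala, Val, Leu, Ile
    "aliphatic-hydroxyl": set("ST"),     # Ser, Thr
    "amide-containing": set("NQ"),       # Asn, Gln
    "aromatic": set("FYW"),              # Phe, Tyr, Trp
    "basic": set("KRH"),                 # Lys, Arg, His
    "sulfur-containing": set("CM"),      # Cys, Met
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues share one of the listed groups."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in SIDE_CHAIN_GROUPS.values())
```

Under this encoding, leucine to isoleucine and lysine to arginine are conservative, whereas lysine to glutamate is not, because glutamate appears in none of the listed groups.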

The term “fragment” as used herein refers to a polypeptide that has an amino-terminal and/or carboxy-terminal deletion as compared to the native protein, but where the remaining amino acid sequence is identical to the corresponding positions in the amino acid sequence deduced from a full-length cDNA sequence. Fragments typically are at least 4 amino acids long, preferably at least 20 amino acids long, usually at least 50 amino acids long or longer, and span the portion of the polypeptide required for intermolecular binding of the compositions (claimed in the present invention) with its various ligands and/or substrates.

The term “polymorphic locus” refers to a locus present in a population that shows variation between members of the population (i.e., the most common allele has a frequency of less than 0.95). In contrast, a “monomorphic locus” is a genetic locus at which little or no variation is seen between members of the population (generally taken to be a locus at which the most common allele exceeds a frequency of 0.95 in the gene pool of the population).
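
For illustration, the 0.95 frequency criterion above may be applied to a list of observed alleles as sketched below. The function name is hypothetical. The definitions above do not address a most-common-allele frequency of exactly 0.95; this sketch assumes, for convenience only, that such a locus is treated as monomorphic.

```python
from collections import Counter


def classify_locus(alleles: list[str], threshold: float = 0.95) -> str:
    """Classify a locus from observed alleles sampled in a population.

    Returns 'polymorphic' when the most common allele has a frequency below
    the threshold, otherwise 'monomorphic'. Treating a frequency exactly
    equal to the threshold as monomorphic is an assumption of this sketch.
    """
    if not alleles:
        raise ValueError("no alleles observed")
    most_common_count = Counter(alleles).most_common(1)[0][1]
    frequency = most_common_count / len(alleles)
    return "polymorphic" if frequency < threshold else "monomorphic"


if __name__ == "__main__":
    print(classify_locus(["A"] * 90 + ["G"] * 10))  # most common allele at 0.90
    print(classify_locus(["A"] * 99 + ["G"] * 1))   # most common allele at 0.99
```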

As used herein, the term “genetic variation information” or “genetic variant information” refers to the presence or absence of one or more variant nucleic acid sequences (e.g., polymorphisms or mutations) in a given allele of a particular gene (e.g., a NOD gene of the present invention).

As used herein, the term “detection assay” refers to an assay for detecting the presence or absence of variant nucleic acid sequences (e.g., polymorphisms or mutations) in a given allele of a particular gene (e.g., a NOD gene). Examples of suitable detection assays include, but are not limited to, those described below in Section III B.

The term “naturally-occurring” as used herein as applied to an object refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally-occurring.

“Amplification” is a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out.

Template specificity is achieved in most amplification techniques by the choice of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (D. L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic acid will not be replicated by this amplification enzyme. Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides, where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (D. Y. Wu and R. B. Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]).

As used herein, the term “amplifiable nucleic acid” is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” will usually comprise “sample template.”

As used herein, the term “sample template” refers to nucleic acid originating from a sample that is analyzed for the presence of “target” (defined below). In contrast, “background template” is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.

As used herein, the term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced, (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.

As used herein, the term “probe” refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention will be labeled with any “reporter molecule,” so that it is detectable in any detection system, including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.

As used herein, the term “target,” refers to a nucleic acid sequence or structure to be detected or characterized. Thus, the “target” is sought to be sorted out from other nucleic acid sequences. A “segment” is defined as a region of nucleic acid within the target sequence.

As used herein, the term “polymerase chain reaction” (“PCR”) refers to the method of K. B. Mullis U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference, that describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the “polymerase chain reaction” (hereinafter “PCR”). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified.”
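
The effect of repeated cycles, and the dependence of amplified segment length on primer position, may be illustrated with the following simplified sketch. It is not taken from the cited patents. It assumes an idealized model in which each cycle multiplies the copy number by (1 + efficiency), with an efficiency of 1.0 corresponding to perfect doubling, and it assumes 1-based coordinates on one strand for the outer ends of the two primers. All names are hypothetical.

```python
def theoretical_pcr_copies(
    initial_copies: int, cycles: int, efficiency: float = 1.0
) -> float:
    """Idealized copy number after a number of PCR cycles.

    Assumes every cycle multiplies the copy number by (1 + efficiency).
    Real reactions plateau, so this is an upper-bound style estimate.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length of the amplified segment bounded by the two primers.

    forward_start: position of the 5' end of the forward primer.
    reverse_end: position, on the same strand, where the reverse primer's
    5' end pairs. Both are 1-based and inclusive (an assumed convention).
    """
    if reverse_end < forward_start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_end - forward_start + 1


if __name__ == "__main__":
    print(theoretical_pcr_copies(1, 30))
    print(amplicon_length(101, 350))
```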

With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of ³²P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.

As used herein, the terms “PCR product,” “PCR fragment,” and “amplification product” refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.

As used herein, the term “amplification reagents” refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for primers, nucleic acid template, and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).

As used herein, the terms “restriction endonucleases” and “restriction enzymes” refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.

As used herein, the term “recombinant DNA molecule” refers to a DNA molecule that is comprised of segments of DNA joined together by means of molecular biological techniques.

As used herein, the term “antisense” is used in reference to RNA sequences that are complementary to a specific RNA sequence (e.g., mRNA). Included within this definition are antisense RNA (“asRNA”) molecules involved in gene regulation by bacteria. Antisense RNA may be produced by any method, including synthesis by splicing the gene(s) of interest in a reverse orientation to a viral promoter that permits the synthesis of a coding strand. Once introduced into an embryo, this transcribed strand combines with natural mRNA produced by the embryo to form duplexes. These duplexes then block either the further transcription of the mRNA or its translation. In this manner, mutant phenotypes may be generated. The term “antisense strand” is used in reference to a nucleic acid strand that is complementary to the “sense” strand. The designation (−) (i.e., “negative”) is sometimes used in reference to the antisense strand, with the designation (+) sometimes used in reference to the sense (i.e., “positive”) strand.

The term “isolated” when used in relation to a nucleic acid, as in “an isolated oligonucleotide” or “isolated polynucleotide” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state they exist in nature. For example, a given DNA sequence (e.g., a gene) is found on the host cell chromosome in proximity to neighboring genes; RNA sequences, such as a specific mRNA sequence encoding a specific protein, are found in the cell as a mixture with numerous other mRNAs that encode a multitude of proteins. However, isolated nucleic acid encoding NOD includes, by way of example, such nucleic acid in cells ordinarily expressing NOD where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature. The isolated nucleic acid, oligonucleotide, or polynucleotide may be present in single-stranded or double-stranded form. When an isolated nucleic acid, oligonucleotide or polynucleotide is to be utilized to express a protein, the oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand (i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be double-stranded).

As used herein, a “portion of a chromosome” refers to a discrete section of the chromosome. Chromosomes are divided into sites or sections by cytogeneticists as follows: the short (relative to the centromere) arm of a chromosome is termed the “p” arm; the long arm is termed the “q” arm. Each arm is then divided into 2 regions termed region 1 and region 2 (region 1 is closest to the centromere). Each region is further divided into bands. The bands may be further divided into sub-bands. For example, the 11p15.5 portion of human chromosome 11 is the portion located on chromosome 11 (11) on the short arm (p) in the first region (1) in the 5th band (5) in sub-band 5 (.5). A portion of a chromosome may be “altered;” for instance the entire portion may be absent due to a deletion or may be rearranged (e.g., inversions, translocations, expanded or contracted due to changes in repeat regions). In the case of a deletion, an attempt to hybridize (i.e., specifically bind) a probe homologous to a particular portion of a chromosome could result in a negative result (i.e., the probe could not bind to the sample containing genetic material suspected of containing the missing portion of the chromosome). Thus, hybridization of a probe homologous to a particular portion of a chromosome may be used to detect alterations in a portion of a chromosome.
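
The chromosome, arm, region, band and sub-band designation described above may be decomposed programmatically, as in the following sketch. The regular expression and the function name are assumptions made for illustration; the pattern accepts only single-digit region and band fields, consistent with the 11p15.5 example, and is not a complete cytogenetic nomenclature parser.

```python
import re

# Chromosome (1-2 digits, X or Y), arm (p or q), one-digit region,
# one-digit band, and an optional sub-band after a period.
_BAND_PATTERN = re.compile(
    r"^(?P<chromosome>\d{1,2}|[XY])"
    r"(?P<arm>[pq])"
    r"(?P<region>\d)"
    r"(?P<band>\d)"
    r"(?:\.(?P<sub_band>\d+))?$"
)


def parse_chromosome_portion(label: str) -> dict:
    """Split a label such as '11p15.5' into its named components."""
    match = _BAND_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized chromosome portion: {label!r}")
    # sub_band is None when the label carries no sub-band.
    return match.groupdict()


if __name__ == "__main__":
    print(parse_chromosome_portion("11p15.5"))
```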

The term “sequences associated with a chromosome” means preparations of chromosomes (e.g., spreads of metaphase chromosomes), nucleic acid extracted from a sample containing chromosomal DNA (e.g., preparations of genomic DNA); the RNA that is produced by transcription of genes located on a chromosome (e.g., hnRNA and mRNA), and cDNA copies of the RNA transcribed from the DNA located on a chromosome. Sequences associated with a chromosome may be detected by numerous techniques including probing of Southern and Northern blots and in situ hybridization to RNA, DNA, or metaphase chromosomes with probes containing sequences homologous to the nucleic acids in the above listed preparations.

As used herein the term “portion” when in reference to a nucleotide sequence (as in “a portion of a given nucleotide sequence”) refers to fragments of that sequence. The fragments may range in size from four nucleotides to the entire nucleotide sequence minus one nucleotide (10 nucleotides, 20, 30, 40, 50, 100, 200, etc.).

As used herein the term “coding region” when used in reference to a structural gene refers to the nucleotide sequences that encode the amino acids found in the nascent polypeptide as a result of translation of an mRNA molecule. The coding region is bounded, in eukaryotes, on the 5′ side by the nucleotide triplet “ATG” that encodes the initiator methionine and on the 3′ side by one of the three triplets, which specify stop codons (i.e., TAA, TAG, TGA).

As used herein, the term “purified” or “to purify” refers to the removal of contaminants from a sample. For example, NOD antibodies are purified by removal of contaminating non-immunoglobulin proteins; they are also purified by the removal of immunoglobulin that does not bind a NOD polypeptide. The removal of non-immunoglobulin proteins and/or the removal of immunoglobulins that do not bind a NOD polypeptide results in an increase in the percent of NOD-reactive immunoglobulins in the sample. In another example, recombinant NOD polypeptides are expressed in bacterial host cells and the polypeptides are purified by the removal of host cell proteins; the percent of recombinant NOD polypeptides is thereby increased in the sample.

The term “recombinant protein” or “recombinant polypeptide” as used herein refers to a protein molecule that is expressed from a recombinant DNA molecule.

The term “native protein” as used herein, is used to indicate a protein that does not contain amino acid residues encoded by vector sequences; that is the native protein contains only those amino acids found in the protein as it occurs in nature. A native protein may be produced by recombinant means or may be isolated from a naturally occurring source.

As used herein the term “portion” when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein. The fragments may range in size from four consecutive amino acid residues to the entire amino acid sequence minus one amino acid.

The term “Southern blot,” refers to the analysis of DNA on agarose or acrylamide gels to fractionate the DNA according to size followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then probed with a labeled probe to detect DNA species complementary to the probe used. The DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support. Southern blots are a standard tool of molecular biologists (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58 [1989]).

The term “Northern blot,” as used herein refers to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists (J. Sambrook, et al., supra, pp 7.39-7.52 [1989]).

The term “Western blot” refers to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a membrane. The proteins are run on acrylamide gels to separate the proteins, followed by transfer of the protein from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized proteins are then exposed to antibodies with reactivity against an antigen of interest. The binding of the antibodies may be detected by various methods, including the use of radiolabeled antibodies.

The term “antigenic determinant” as used herein refers to that portion of an antigen that makes contact with a particular antibody (i.e., an epitope). When a protein or fragment of a protein is used to immunize a host animal, numerous regions of the protein may induce the production of antibodies that bind specifically to a given region or three-dimensional structure on the protein; these regions or structures are referred to as antigenic determinants. An antigenic determinant may compete with the intact antigen (i.e., the “immunogen” used to elicit the immune response) for binding to an antibody.

The term “transgene” as used herein refers to a foreign, heterologous, or autologous gene that is placed into an organism by introducing the gene into newly fertilized eggs or early embryos. The term “foreign gene” refers to any nucleic acid (e.g., gene sequence) that is introduced into the genome of an animal by experimental manipulations and may include gene sequences found in that animal so long as the introduced gene does not reside in the same location as does the naturally-occurring gene. The term “autologous gene” is intended to encompass variants (e.g., polymorphisms or mutants) of the naturally occurring gene. The term transgene thus encompasses the replacement of the naturally occurring gene with a variant form of the gene.

As used herein, the term “vector” is used in reference to nucleic acid molecules that transfer DNA segment(s) from one cell to another. The term “vehicle” is sometimes used interchangeably with “vector.”

The term “expression vector” as used herein refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.

As used herein, the term “host cell” refers to any eukaryotic or prokaryotic cell (e.g., bacterial cells such as E. coli, yeast cells, mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells), whether located in vitro or in vivo. For example, host cells may be located in a transgenic animal.

The terms “overexpression” and “overexpressing” and grammatical equivalents, are used in reference to levels of mRNA to indicate a level of expression approximately 3-fold higher than that typically observed in a given tissue in a control or non-transgenic animal. Levels of mRNA are measured using any of a number of techniques known to those skilled in the art including, but not limited to Northern blot analysis (See, Example 10, for a protocol for performing Northern blot analysis). Appropriate controls are included on the Northern blot to control for differences in the amount of RNA loaded from each tissue analyzed (e.g., the amount of 28S rRNA, an abundant RNA transcript present at essentially the same amount in all tissues, present in each sample can be used as a means of normalizing or standardizing the NOD mRNA-specific signal observed on Northern blots). The amount of mRNA present in the band corresponding in size to the correctly spliced NOD transgene RNA is quantified; other minor species of RNA which hybridize to the transgene probe are not considered in the quantification of the expression of the transgenic mRNA.
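
The normalization and fold comparison described above may be sketched as follows, purely for illustration. The signal values are assumed to be band intensities in arbitrary units. The function names are hypothetical, and the use of 3.0 as a cut-off is an assumption made for this sketch, since the definition recites “approximately 3-fold” rather than a strict threshold.

```python
def normalized_fold_expression(
    sample_mrna_signal: float,
    sample_28s_signal: float,
    control_mrna_signal: float,
    control_28s_signal: float,
) -> float:
    """Fold expression of sample over control after 28S rRNA normalization.

    Each mRNA-specific band intensity is divided by the 28S rRNA intensity
    from the same lane to correct for differences in RNA loading.
    """
    if min(sample_28s_signal, control_28s_signal, control_mrna_signal) <= 0:
        raise ValueError("reference signals must be positive")
    sample_norm = sample_mrna_signal / sample_28s_signal
    control_norm = control_mrna_signal / control_28s_signal
    return sample_norm / control_norm


def is_overexpressed(fold: float, minimum_fold: float = 3.0) -> bool:
    """Illustrative cut-off; the 3.0 default is an assumption of this sketch."""
    return fold >= minimum_fold


if __name__ == "__main__":
    fold = normalized_fold_expression(600.0, 100.0, 200.0, 100.0)
    print(fold, is_overexpressed(fold))
```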

The term “transfection” as used herein refers to the introduction of foreign DNA into eukaryotic cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, and biolistics.

The term “stable transfection” or “stably transfected” refers to the introduction and integration of foreign DNA into the genome of the transfected cell. The term “stable transfectant” refers to a cell that has stably integrated foreign DNA into the genomic DNA.

The term “transient transfection” or “transiently transfected” refers to the introduction of foreign DNA into a cell where the foreign DNA fails to integrate into the genome of the transfected cell. The foreign DNA persists in the nucleus of the transfected cell for several days. During this time the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes. The term “transient transfectant” refers to cells that have taken up foreign DNA but have failed to integrate this DNA.

The term “calcium phosphate co-precipitation” refers to a technique for the introduction of nucleic acids into a cell. The uptake of nucleic acids by cells is enhanced when the nucleic acid is presented as a calcium phosphate-nucleic acid co-precipitate. The original technique of Graham and van der Eb (Graham and van der Eb, Virol., 52:456 [1973]), has been modified by several groups to optimize conditions for particular types of cells. The art is well aware of these numerous modifications.

A “composition comprising a given polynucleotide sequence” as used herein refers broadly to any composition containing the given polynucleotide sequence. The composition may comprise an aqueous solution. Compositions comprising polynucleotide sequences encoding NODs (e.g., SEQ ID NOs:1-11) or fragments thereof may be employed as hybridization probes. In this case, the NOD encoding polynucleotide sequences are typically employed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

The term “test compound” refers to any chemical entity, pharmaceutical, drug, and the like that can be used to treat or prevent a disease, illness, sickness, or disorder of bodily function, or otherwise alter the physiological or cellular status of a sample. Test compounds comprise both known and potential therapeutic compounds. A test compound can be determined to be therapeutic by screening using the screening methods of the present invention. A “known therapeutic compound” refers to a therapeutic compound that has been shown (e.g., through animal trials or prior experience with administration to humans) to be effective in such treatment or prevention.

The term “sample” as used herein is used in its broadest sense. A sample suspected of containing a human chromosome or sequences associated with a human chromosome may comprise a cell, chromosomes isolated from a cell (e.g., a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern blot analysis), RNA (in solution or bound to a solid support such as for Northern blot analysis), cDNA (in solution or bound to a solid support) and the like. A sample suspected of containing a protein may comprise a cell, a portion of a tissue, an extract containing one or more proteins and the like.

As used herein, the term “response,” when used in reference to an assay, refers to the generation of a detectable signal (e.g., accumulation of reporter protein, increase in ion concentration, accumulation of a detectable chemical product).

As used herein, the term “reporter gene” refers to a gene encoding a protein that may be assayed. Examples of reporter genes include, but are not limited to, luciferase (See, e.g., deWet et al., Mol. Cell. Biol. 7:725 [1987] and U.S. Pat. Nos. 6,074,859; 5,976,796; 5,674,713; and 5,618,682; all of which are incorporated herein by reference), green fluorescent protein (e.g., GenBank Accession Number U43284; a number of GFP variants are commercially available from CLONTECH Laboratories, Palo Alto, Calif.), chloramphenicol acetyltransferase, β-galactosidase, alkaline phosphatase, and horseradish peroxidase.

As used herein, the terms “computer memory” and “computer memory device” refer to any storage media readable by a computer processor. Examples of computer memory include, but are not limited to, RAM, ROM, computer chips, digital video discs (DVDs), compact discs (CDs), hard disk drives (HDD), and magnetic tape.

As used herein, the term “computer readable medium” refers to any device or system for storing and providing information (e.g., data and instructions) to a computer processor. Examples of computer readable media include, but are not limited to, DVDs, CDs, hard disk drives, magnetic tape and servers for streaming media over networks.

As used herein, the term “entering” as in “entering said genetic variation information into said computer” refers to transferring information to a “computer readable medium.” Information may be transferred by any suitable method, including but not limited to, manually (e.g., by typing into a computer) or automated (e.g., transferred from another “computer readable medium” via a “processor”).

As used herein, the terms “processor” and “central processing unit” or “CPU” are used interchangeably and refer to a device that is able to read a program from a computer memory (e.g., ROM or other computer memory) and perform a set of steps according to the program.

As used herein, the term “computer implemented method” refers to a method utilizing a “CPU” and “computer readable medium.”

GENERAL DESCRIPTION OF THE INVENTION

The nucleotide-binding oligomerization domain (NOD) was first found in Apaf-1 and its nematode homologue CED-4, two pivotal regulators of developmental and p53-dependent programmed cell death (Lui and Hengartner, supra; Derry et al., supra). Subsequently, two NOD-containing molecules, NOD1 (CARD4) and NOD2, were identified through database searches for Apaf-1/CED-4 homologues. Since then, the NOD protein family has greatly expanded and currently contains a large number of proteins from animals, plants, fungi and bacteria, including >20 human proteins homologous to Apaf-1 and NOD1 (FIG. 1). The majority of NOD family members are comprised of three distinct functional domains, an amino-terminal effector binding domain (EBD), a centrally located NOD and a carboxy-terminal ligand recognition domain (LRD) (Table 2). The NOD mediates self oligomerization, which, in some embodiments, functions in the activation of downstream effector molecules. The EBD of mammalian NOD proteins mediates the binding to effector molecules which determines the downstream events activated upon signaling, including apoptosis and NF-κB activation (Table 2).

Some NOD proteins share the same type of effector domain (e.g., CARD or PYD). In some embodiments, the NOD proteins activate different signaling cascades as the interaction between these domains and those present in downstream binding partners is highly specific. For example, the PYD of ASC, a downstream adaptor molecule involved in NOD signalling, associates with the PYD of cryopyrin, but not with the PYD present in NALP2, PAN2, PYPAF3, PYPAF4, PYPAF6 or NOD27 (Grenier et al., FEBS Lett. 530, 73-78 (2002)). In other embodiments, certain NOD proteins like NOD1 and NOD2 interact with and use a common downstream molecule, RICK, to activate identical or similar signalling pathways (FIG. 3). Transient expression of NOD1 and NOD2 in mammalian cells induces NF-κB activation (Bertin et al., J. Biol. Chem. 274, 12955-12958 (1999); Inohara et al., J. Biol. Chem. 274, 14560-14567 (1999); Ogura et al., J. Biol. Chem. 276, 4812-4818 (2001)). Mutational analyses demonstrated that the CARDs and the NODs of NOD1 and NOD2 were required for the induction of NF-κB whereas their LRRs were dispensable (Inohara et al., J. Biol. Chem. 274, 14560-14567 (1999); Ogura et al., J. Biol. Chem. 276, 4812-4818 (2001)). Thus, in some embodiments, the CARDs act as effector domains for NOD1 and NOD2 signalling. Both NOD1 and NOD2 physically associate with RICK, a CARD-containing protein kinase, through homophilic CARD-CARD interactions (Inohara et al., J. Biol. Chem. 274, 14560-14567 (1999); Ogura et al., J. Biol. Chem. 276, 4812-4818 (2001)). A role for RICK in NOD1 and NOD2 signalling is supported by several studies (Inohara et al., supra; Ogura et al., supra).

Several NOD-LRR proteins, including IPAF, cryopyrin, and DEFCAP, associate with ASC (Manji et al., J. Biol. Chem. 277, 11570-11575 (2002); Geddes et al., Biochem. Biophys. Res. Commun. 284, 77-82 (2001); Martinon et al., Mol. Cell. 10, 417-426 (2002)). ASC (also called TMS1/PYCARD) is an adaptor molecule originally identified in a sub-cytosolic fraction called the “speck” in cells undergoing apoptosis. ASC is composed of an amino-terminal PYD and a carboxy-terminal CARD. Co-expression of ASC with several PYD-containing NOD proteins including cryopyrin, PYPAF5 or PYPAF7, as well as with the CARD-containing IPAF, induces NF-κB activation (Manji et al., supra). Thus, in some embodiments, PYD-containing NOD proteins use the adaptor ASC for signaling (Grenier et al., supra). NF-κB activation induced through ASC signalling is inhibited by dominant forms of NEMO/IKKγ (Manji et al., supra; Grenier et al., supra). Thus, ASC signals, as was reported for RICK, through the common IKK signalling pathway of NF-κB activation (FIG. 3).

Multiple NOD proteins including NOD1, NOD2, IPAF and DEFCAP promote activation of pro-inflammatory caspases. For example, NOD1 promotes caspase-1 activation in transient overexpression studies (Yoo et al., Biochem. Biophys. Res. Commun. 299, 652-658 (2002)). IPAF, cryopyrin, DEFCAP, PYPAF5 and PYPAF7 have been found to regulate, in the presence of ASC, the activation of caspase-1 (interleukin-1β converting enzyme) (Grenier et al., supra; Wang et al., J. Biol. Chem. 277, 29874-29880 (2002)). DEFCAP, the only NOD family member known to possess both a CARD and a PYD, can form an endogenous multi-protein complex containing ASC, caspase-1 and caspase-5, dubbed “the inflammasome”, which promotes caspase activation and processing of pro-interleukin-1β (Martinon et al., Mol. Cell. 10, 417-426 (2002)).

In some embodiments, NOD proteins (e.g., Apaf-1, NOD1, NOD2, DEFCAP, IPAF and cryopyrin) induce or enhance apoptosis (Inohara et al., J. Biol. Chem. 274, 14560-14567 (1999); Ogura et al., J. Biol. Chem. 276, 4812-4818 (2001); Geddes et al., Biochem. Biophys. Res. Commun. 284, 77-82 (2001); Poyet et al., J. Biol. Chem. 276, 28309-28313 (2001); Hlaing et al., J. Biol. Chem. 276, 9230-9238 (2001); Zou et al., Cell 90, 405-413 (1997)). For example, NOD1 and DEFCAP interact with multiple caspases and/or Apaf-1 (Hlaing et al., supra; Inohara and Nuñez, Oncogene, 20, 6473-6481 (2001)). Co-expression of IPAF or cryopyrin with ASC or forced oligomerization of IPAF or cryopyrin induces apoptosis in mammalian cells, which requires caspase activity. NOD1, IPAF, cryopyrin, PYPAF5 and PYPAF7 induce both NF-κB and caspase-1 activation. Thus, in some embodiments, NOD pro-apoptotic activity results from the activation of inflammatory caspases. In other embodiments, apoptotic caspases contribute to the activation of inflammatory caspases.

In some embodiments, the induction of both NF-κB and apoptosis by NOD proteins is similar to that observed with TLRs, PKR and death receptors (DRs), which induce apoptosis through the activation of caspases. Upon DR signalling, the induction of apoptosis is suppressed in vivo by simultaneous activation of NF-κB, which leads to the expression of anti-apoptotic genes (Beg and Baltimore, Science 274, 782-784 (1996); Wang et al., Science 281, 1680-1683 (1998); Micheau et al., Mol. Cell. Biol. 21, 5299-5305 (2001)). Thus, in some embodiments, under physiological conditions, the pro-apoptotic activity induced through NOD proteins is suppressed by simultaneous induction of NF-κB activity.

Genetic variation in three human NOD proteins has been implicated in the development of genetic diseases (Hull et al., Curr Opin Rheumatol. 15, 61-69 (2003)). For example, mutations in CIITA are known to cause type II bare lymphocyte syndrome (BLS), a hereditary immunodeficiency disorder characterized by the absence of MHCII expression (Steimle et al., Cell 75, 135-146 (1993); Reith and Mach, Annu Rev Immunol. 19, 331-373 (2001)). More recently, mutations in NOD2 and CIAS1 (the gene encoding cryopyrin) have been implicated in several autoinflammatory diseases. A frameshift mutation, L1007fsinsC, and two missense mutations (G908R and R702W) in NOD2 are associated with Crohn's disease (CD), a common inflammatory disease of the intestinal tract (Ogura et al., Nature 411, 603-606 (2001); Hugot et al., Nature 411, 599-603 (2001); Hampe et al., Lancet 357, 1925-1928 (2001)). Having one copy of the mutated alleles confers a 2-4-fold increased risk of developing CD, whereas homozygosity or compound heterozygosity for NOD2 mutations increases the risk 20-40-fold, indicating that lack of NOD2 function is important for disease development. All three CD-associated mutations result in proteins that are deficient in inducing PGN- and MDP-mediated NF-κB activation. Activation of NF-κB induced by MDP is absent in mononuclear cells derived from CD patients homozygous for L1007fsinsC.

In addition to CD, missense mutations in the coding region of NOD2 have been associated with Blau syndrome, an autosomal dominant trait characterized by arthritis, uveitis and skin rashes (Miceli-Richard et al., Nat. Genet. 29, 19-20 (2001)). NOD2 mutations resulting in Blau syndrome are located in the NOD (Miceli-Richard et al., supra). NOD2 mutant proteins found in patients with Blau syndrome induce increased basal NF-κB activity when compared to wild-type NOD2. Thus, variant proteins found in patients with Blau syndrome may represent constitutively active NOD2 mutations. This is in contrast to CD-associated NOD2 variants, which have normal or reduced levels of basal activity but are defective in their response to bacterial components (Ogura et al., Nature 411, 603-606 (2001); Bonen et al., Gastroenterology 124, 140-146 (2003)).

Mutations in the CIAS1 gene, which encodes cryopyrin, are the cause of several autoinflammatory syndromes characterized by recurrent episodes of seemingly unprovoked inflammation (Hoffman et al., Nature Genet. 29, 301-305 (2001); Feldmann et al., Am. J. Hum. Genet. 71, 198-203 (2002); Aksentijevich et al., Arthritis Rheum. 46, 3340-3348 (2002); Aganna et al., Arthritis Rheum. 46, 2445-2452 (2002)). These autosomal-dominant diseases include familial cold autoinflammatory syndrome (FACS), Muckle-Wells syndrome (MWS) and neonatal-onset multisystem inflammatory disease (NOMID, also known as chronic infantile neurologic cutaneous articular syndrome or CINCA). Patients with FACS, MWS and NOMID carry missense mutations that localize to the NOD of cryopyrin. The R260W mutation associated with FACS and MWS corresponds to the R334W NOD2 mutation found in Blau syndrome (Miceli-Richard et al., supra). The present invention is not limited to a particular mechanism. Indeed, an understanding of the mechanism of the present invention is not required to practice the present invention. Nonetheless, it is contemplated that this observation suggests that R260W cryopyrin may represent a constitutively active mutation which may lead to a deregulated activation of NF-κB and inflammatory caspases (FIG. 5).

Pyrin has been implicated in familial Mediterranean fever (FMF), an autosomal-recessive disease characterized by recurrent episodes of fever and localized inflammation (The International FMF Consortium, Cell 90, 797-807 (1997)). The gene mutated in FMF encodes a protein called pyrin, which is composed of an amino-terminal PYD, a B-type zinc-finger box, a coiled coil, a PRY domain and a Spla and Ryanodine receptor (SPRY) domain (The International FMF Consortium, supra).

In some embodiments, the present invention provides novel NOD genes (e.g., those described in SEQ ID NOs: 1-22 and Table 1). The novel NOD genes of the present invention were identified by searching public gene databases for proteins with homology to known NOD proteins. The present invention is not limited to a particular mechanism. Indeed, an understanding of the mechanism of the present invention is not necessary to practice the present invention. Nonetheless, it is contemplated that these genes are associated with inflammatory diseases. In particular, linkage analysis of NOD27 conducted during the course of development of the present invention revealed a locus in the chromosomal region that is associated with psoriasis. Accordingly, it is further contemplated that NOD27 is associated with psoriasis.

In some embodiments, the present invention provides an “expression profile” of inflammatory diseases. For example, in some embodiments, the expression and/or presence of variant alleles of the NOD proteins of the present invention is determined. Such expression profiles can then be correlated with disease states or susceptibility to disease.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to the NOD proteins and nucleic acids encoding the NOD proteins. The present invention further provides assays for the detection of NOD polymorphisms and mutations associated with disease states. Exemplary embodiments of the present invention are described below.

I. NOD Polynucleotides

As described above, the present invention provides novel NOD family genes. Accordingly, the present invention provides nucleic acids encoding NOD genes, homologs, and variants (e.g., polymorphisms and mutants), including, but not limited to, those described in SEQ ID NOs: 1-11. Table 1 describes the NOD genes of the present invention. In some embodiments, the present invention provides polynucleotide sequences that are capable of hybridizing to SEQ ID NOs: 1-11 under conditions of low to high stringency as long as the polynucleotide sequence capable of hybridizing encodes a protein that retains a biological activity of the naturally occurring NODs. In some embodiments, the protein that retains a biological activity of naturally occurring NOD is 70% homologous to wild-type NOD, preferably 80% homologous to wild-type NOD, more preferably 90% homologous to wild-type NOD, and most preferably 95% homologous to wild-type NOD. In preferred embodiments, hybridization conditions are based on the melting temperature (T_(m)) of the nucleic acid binding complex and confer a defined “stringency” as explained above (See e.g., Wahl, et al., Meth. Enzymol., 152:399-407 [1987], incorporated herein by reference).
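
By way of non-limiting illustration, the 70%, 80%, 90% and 95% thresholds recited above can be applied to a pair of pre-aligned protein sequences as sketched below. The sketch assumes that "homologous" is measured as percent identity over aligned, non-gap positions, which is one common reading and is not mandated by the present description. The function names and the example sequences are hypothetical.

```python
# Illustrative sketch only: treats "percent homologous" as percent identity
# over aligned, non-gap columns (an assumption, not a definition from the text).

THRESHOLDS = (95, 90, 80, 70)  # most preferred first, as recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared columns at which two pre-aligned sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # columns containing a gap are not compared
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def homology_tier(pct: float):
    """Return the highest recited threshold that pct meets, or None."""
    for threshold in THRESHOLDS:
        if pct >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    wild_type = "MKTAYIAKQR"  # hypothetical sequence, not a NOD sequence
    variant = "MKTAYIAKQK"    # differs at the final position
    pct = percent_identity(wild_type, variant)
    print(pct, homology_tier(pct))
```

Skipping gap columns is a design choice; counting gaps as mismatches would give a stricter figure.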

In other embodiments of the present invention, additional alleles of NODgenes are provided. In preferred embodiments, alleles result from apolymorphism or mutation (i.e., a change in the nucleic acid sequence)and generally produce altered mRNAs or polypeptides whose structure orfunction may or may not be altered. Any given gene may have none, one ormany allelic forms. Common mutational changes that give rise to allelesare generally ascribed to deletions, additions or substitutions ofnucleic acids. Each of these types of changes may occur alone, or incombination with the others, and at the rate of one or more times in agiven sequence. Examples of the alleles of the present invention includethose encoded by SEQ ID NOs: 1-11 (wild type) and disease allelesthereof.

In still other embodiments of the present invention, the nucleotidesequences of the present invention may be engineered in order to alteran NOD coding sequence for a variety of reasons, including but notlimited to, alterations which modify the cloning, processing and/orexpression of the gene product. For example, mutations may be introducedusing techniques that are well known in the art (e.g., site-directedmutagenesis to insert new restriction sites, to alter glycosylationpatterns, to change codon preference, etc.).

In some embodiments of the present invention, the polynucleotide sequence of NOD may be extended utilizing the nucleotide sequence (e.g., SEQ ID NOs: 1-11) in various methods known in the art to detect upstream sequences such as promoters and regulatory elements. For example, it is contemplated that restriction-site polymerase chain reaction (PCR) will find use in the present invention. This is a direct method that uses universal primers to retrieve unknown sequence adjacent to a known locus (Gobinda et al., PCR Methods Applic., 2:318-22 [1993]). First, genomic DNA is amplified in the presence of a primer to a linker sequence and a primer specific to the known region. The amplified sequences are then subjected to a second round of PCR with the same linker primer and another specific primer internal to the first one. Products of each round of PCR are transcribed with an appropriate RNA polymerase and sequenced using reverse transcriptase.

In another embodiment, inverse PCR can be used to amplify or extend sequences using divergent primers based on a known region (Triglia et al., Nucleic Acids Res., 16:8186 [1988]). The primers may be designed using Oligo 4.0 (National Biosciences Inc, Plymouth Minn.), or another appropriate program, to be 22-30 nucleotides in length, to have a GC content of 50% or more, and to anneal to the target sequence at temperatures of about 68-72° C. The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. In still other embodiments, walking PCR is utilized. Walking PCR is a method for targeted gene walking that permits retrieval of unknown sequence (Parker et al., Nucleic Acids Res., 19:3055-60 [1991]). The PROMOTERFINDER kit (Clontech) uses PCR, nested primers and special libraries to “walk in” genomic DNA. This process avoids the need to screen libraries and is useful in finding intron/exon junctions.
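
By way of non-limiting illustration, the primer length and GC content criteria recited above (22-30 nucleotides; GC content of 50% or more) can be screened as sketched below. The sketch is not the Oligo 4.0 program and does not estimate annealing temperature, which would require a thermodynamic model not described here. The function names and example primers are hypothetical.

```python
# Illustrative sketch only: screens candidate primers against the length and
# GC-content criteria recited in the text. Annealing temperature is NOT
# evaluated here.


def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def meets_primer_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """True if the primer satisfies the recited length and GC criteria."""
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


if __name__ == "__main__":
    # Hypothetical primers chosen only to exercise the checks.
    for candidate in ("GCGCATGCGCATGCGCATGCGCAT", "ATATATATATATATATATATATAT"):
        print(candidate, meets_primer_criteria(candidate))
```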

Preferred libraries for screening for full length cDNAs include mammalian libraries that have been size-selected to include larger cDNAs. Also, random primed libraries are preferred, in that they will contain more sequences that contain the 5′ and upstream gene regions. A randomly primed library may be particularly useful in cases where an oligo d(T) library does not yield full-length cDNA. Genomic mammalian libraries are useful for obtaining introns and extending 5′ sequence.

In other embodiments of the present invention, variants of the disclosedNOD sequences are provided. In preferred embodiments, variants resultfrom polymorphisms or mutations (i.e., a change in the nucleic acidsequence) and generally produce altered mRNAs or polypeptides whosestructure or function may or may not be altered. Any given gene may havenone, one, or many variant forms. Common mutational changes that giverise to variants are generally ascribed to deletions, additions orsubstitutions of nucleic acids. Each of these types of changes may occuralone, or in combination with the others, and at the rate of one or moretimes in a given sequence.

It is contemplated that it is possible to modify the structure of apeptide having a function (e.g., NOD function) for such purposes asaltering the biological activity (e.g., Nod signaling). Such modifiedpeptides are considered functional equivalents of peptides having anactivity of a NOD peptide as defined herein. A modified peptide can beproduced in which the nucleotide sequence encoding the polypeptide hasbeen altered, such as by substitution, deletion, or addition. Inparticularly preferred embodiments, these modifications do notsignificantly reduce the biological activity of the modified NOD genes.In other words, construct “X” can be evaluated in order to determinewhether it is a member of the genus of modified or variant NOD's of thepresent invention as defined functionally, rather than structurally. Inpreferred embodiments, the activity of variant NOD polypeptides isevaluated by methods described herein (e.g., the generation oftransgenic animals or the use of signaling assays).

Moreover, as described above, variant forms of NOD genes are also contemplated as being equivalent to those peptides and DNA molecules that are set forth in more detail herein. For example, it is contemplated that isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid (i.e., conservative mutations) will not have a major effect on the biological activity of the resulting molecule. Accordingly, some embodiments of the present invention provide variants of NOD disclosed herein containing conservative replacements. Conservative replacements are those that take place within a family of amino acids that are related in their side chains. Genetically encoded amino acids can be divided into four families: (1) acidic (aspartate, glutamate); (2) basic (lysine, arginine, histidine); (3) nonpolar (alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan); and (4) uncharged polar (glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine). Phenylalanine, tryptophan, and tyrosine are sometimes classified jointly as aromatic amino acids. In similar fashion, the amino acid repertoire can be grouped as (1) acidic (aspartate, glutamate); (2) basic (lysine, arginine, histidine); (3) aliphatic (glycine, alanine, valine, leucine, isoleucine, serine, threonine), with serine and threonine optionally being grouped separately as aliphatic-hydroxyl; (4) aromatic (phenylalanine, tyrosine, tryptophan); (5) amide (asparagine, glutamine); and (6) sulfur-containing (cysteine and methionine) (e.g., Stryer ed., Biochemistry, pg. 17-21, 2nd ed, WH Freeman and Co., 1981). Whether a change in the amino acid sequence of a peptide results in a functional polypeptide can be readily determined by assessing the ability of the variant peptide to function in a fashion similar to the wild-type protein. Peptides having more than one replacement can readily be tested in the same manner.
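
By way of non-limiting illustration, the four-family grouping recited above can be used to flag whether a single amino acid replacement is conservative, as sketched below. The sketch uses one-letter amino acid codes and only the four-family scheme (not the alternative six-group scheme); the function names are hypothetical. A replacement flagged as conservative is only predicted to be tolerated, and activity would still be confirmed by the assays described herein.

```python
# Illustrative sketch only: the four side-chain families recited in the text,
# written with one-letter amino acid codes.
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "nonpolar": set("AVLIPFMW"),
    "uncharged polar": set("GNQCSTY"),
}


def family_of(residue: str) -> str:
    """Return the name of the family containing the given residue."""
    residue = residue.upper()
    for name, members in FAMILIES.items():
        if residue in members:
            return name
    raise ValueError(f"not a genetically encoded amino acid: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall within the same side-chain family."""
    return family_of(original) == family_of(replacement)


if __name__ == "__main__":
    # Examples taken from the text: Leu->Ile, Asp->Glu, Thr->Ser, Gly->Trp.
    for pair in (("L", "I"), ("D", "E"), ("T", "S"), ("G", "W")):
        print(pair, is_conservative(*pair))
```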

More rarely, a variant includes “nonconservative” changes (e.g., replacement of a glycine with a tryptophan). Analogous minor variations can also include amino acid deletions or insertions, or both. Guidance in determining which amino acid residues can be substituted, inserted, or deleted without abolishing biological activity can be found using computer programs (e.g., LASERGENE software, DNASTAR Inc., Madison, Wis.).

As described in more detail below, variants may be produced by methods such as directed evolution or other techniques for producing combinatorial libraries of variants. In still other embodiments of the present invention, the nucleotide sequences of the present invention may be engineered in order to alter a NOD coding sequence including, but not limited to, alterations that modify the cloning, processing, localization, secretion, and/or expression of the gene product. For example, mutations may be introduced using techniques that are well known in the art (e.g., site-directed mutagenesis to insert new restriction sites, alter glycosylation patterns, or change codon preference, etc.).

TABLE 1
Nod Genes

Nod Gene    SEQ ID NO (Nucleic acid)    SEQ ID NO (Polypeptide)
Nod3        1                           12
Nod5        2                           13
Nod6        3                           14
Nod8        4                           15
Nod9        5                           16
Nod12       6                           17
Nod14       7                           18
Nod16       8                           19
Nod17       9                           20
Nod26       10                          21
Nod27       11                          22

II. NOD Polypeptides

In other embodiments, the present invention provides NOD polynucleotidesequences that encode NOD polypeptide sequences (e.g., the polypeptidesof SEQ ID NOs: 12-22). Other embodiments of the present inventionprovide fragments, fusion proteins or functional equivalents of theseNOD proteins. In some embodiments, the present invention providesmutants of NOD polypeptides. In still other embodiments of the presentinvention, nucleic acid sequences corresponding to NOD variants,homologs, and mutants may be used to generate recombinant DNA moleculesthat direct the expression of the NOD variants, homologs, and mutants inappropriate host cells. In some embodiments of the present invention,the polypeptide may be a naturally purified product, in otherembodiments it may be a product of chemical synthetic procedures, and instill other embodiments it may be produced by recombinant techniquesusing a prokaryotic or eukaryotic host (e.g., by bacterial, yeast,higher plant, insect and mammalian cells in culture). In someembodiments, depending upon the host employed in a recombinantproduction procedure, the polypeptide of the present invention may beglycosylated or may be non-glycosylated. In other embodiments, thepolypeptides of the invention may also include an initial methionineamino acid residue.

In one embodiment of the present invention, due to the inherent degeneracy of the genetic code, DNA sequences other than the polynucleotide sequences of SEQ ID NOs: 1-11 that encode substantially the same or a functionally equivalent amino acid sequence may be used to clone and express NOD. In general, such polynucleotide sequences hybridize to SEQ ID NOs: 1-11 under conditions of high to medium stringency as described above. As will be understood by those of skill in the art, it may be advantageous to produce NOD-encoding nucleotide sequences possessing non-naturally occurring codons. Therefore, in some preferred embodiments, codons preferred by a particular prokaryotic or eukaryotic host (Murray et al., Nucl. Acids Res., 17 [1989]) are selected, for example, to increase the rate of NOD expression or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, than transcripts produced from naturally occurring sequence.
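
By way of non-limiting illustration, the degeneracy of the genetic code and the substitution of host-preferred codons can be sketched as follows. The sketch uses the standard genetic code. The preference table and the example sequence are hypothetical and are not taken from any host codon usage data or from SEQ ID NOs: 1-11.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, where one is given."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        choice = preferred.get(amino_acid, codon)
        if CODON_TABLE[choice] != amino_acid:
            raise ValueError(f"{choice} does not encode {amino_acid}")
        out.append(choice)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical preference table: illustrative values only.
    preferred = {"L": "CTG", "R": "CGT", "K": "AAA"}
    original = "ATGTTAAGAAAG"  # hypothetical sequence encoding Met-Leu-Arg-Lys
    recoded = recode(original, preferred)
    print(recoded, translate(original) == translate(recoded))
```

The check inside `recode` guards against a preference table that would change the encoded amino acid sequence.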

1. Vectors for Production of NOD

The polynucleotides of the present invention may be employed forproducing polypeptides by recombinant techniques. Thus, for example, thepolynucleotide may be included in any one of a variety of expressionvectors for expressing a polypeptide. In some embodiments of the presentinvention, vectors include, but are not limited to, chromosomal,nonchromosomal and synthetic DNA sequences (e.g., derivatives of SV40,bacterial plasmids, phage DNA; baculovirus, yeast plasmids, vectorsderived from combinations of plasmids and phage DNA, and viral DNA suchas vaccinia, adenovirus, fowl pox virus, and pseudorabies). It iscontemplated that any vector may be used as long as it is replicable andviable in the host.

In particular, some embodiments of the present invention providerecombinant constructs comprising one or more of the sequences asbroadly described above (e.g., SEQ ID NOs: 1-11). In some embodiments ofthe present invention, the constructs comprise a vector, such as aplasmid or viral vector, into which a sequence of the invention has beeninserted, in a forward or reverse orientation. In still otherembodiments, the heterologous structural sequence (e.g., SEQ ID NOs:1-11) is assembled in appropriate phase with translation initiation andtermination sequences. In preferred embodiments of the presentinvention, the appropriate DNA sequence is inserted into the vectorusing any of a variety of procedures. In general, the DNA sequence isinserted into an appropriate restriction endonuclease site(s) byprocedures known in the art.

Large numbers of suitable vectors are known to those of skill in theart, and are commercially available. Such vectors include, but are notlimited to, the following vectors: 1) Bacterial—pQE70, pQE60, pQE-9(Qiagen), pBS, pD10, phagescript, psiX174, pbluescript SK, pBSKS, pNH8A,pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3,pDR540, pRIT5 (Pharmacia); 2) Eukaryotic—pWLNEO, pSV2CAT, pOG44, PXT1,pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia); and 3)Baculovirus—pPbac and pMbac (Stratagene). Any other plasmid or vectormay be used as long as they are replicable and viable in the host. Insome preferred embodiments of the present invention, mammalianexpression vectors comprise an origin of replication, a suitablepromoter and enhancer, and also any necessary ribosome binding sites,polyadenylation sites, splice donor and acceptor sites, transcriptionaltermination sequences, and 5′ flanking non-transcribed sequences. Inother embodiments, DNA sequences derived from the SV40 splice, andpolyadenylation sites may be used to provide the requirednon-transcribed genetic elements.

In certain embodiments of the present invention, the DNA sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. Promoters useful in the present invention include, but are not limited to, the LTR or SV40 promoter, the E. coli lac or trp, the phage lambda P_(L) and P_(R), T3 and T7 promoters, and the cytomegalovirus (CMV) immediate early, herpes simplex virus (HSV) thymidine kinase, and mouse metallothionein-I promoters and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses. In other embodiments of the present invention, recombinant expression vectors include origins of replication and selectable markers permitting transformation of the host cell (e.g., dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or tetracycline or ampicillin resistance in E. coli).

In some embodiments of the present invention, transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to increase its transcription. Enhancers useful in the present invention include, but are not limited to, the SV40 enhancer on the late side of the replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

In other embodiments, the expression vector also contains a ribosome binding site for translation initiation and a transcription terminator. In still other embodiments of the present invention, the vector may also include appropriate sequences for amplifying expression.

2. Host Cells for Production of NOD Polypeptides

In a further embodiment, the present invention provides host cells containing the above-described constructs. In some embodiments of the present invention, the host cell is a higher eukaryotic cell (e.g., a mammalian or insect cell). In other embodiments of the present invention, the host cell is a lower eukaryotic cell (e.g., a yeast cell). In still other embodiments of the present invention, the host cell can be a prokaryotic cell (e.g., a bacterial cell). Specific examples of host cells include, but are not limited to, Escherichia coli, Salmonella typhimurium, Bacillus subtilis, and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, as well as Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila S2 cells, Spodoptera Sf9 cells, Chinese hamster ovary (CHO) cells, COS-7 lines of monkey kidney fibroblasts (Gluzman, Cell 23:175 [1981]), C127, 3T3, 293, 293T, HeLa and BHK cell lines.

The constructs in host cells can be used in a conventional manner toproduce the gene product encoded by the recombinant sequence. In someembodiments, introduction of the construct into the host cell can beaccomplished by calcium phosphate transfection, DEAE-Dextran mediatedtransfection, or electroporation (See e.g., Davis et al., Basic Methodsin Molecular Biology, [1986]). Alternatively, in some embodiments of thepresent invention, the polypeptides of the invention can besynthetically produced by conventional peptide synthesizers.

Proteins can be expressed in mammalian cells, yeast, bacteria, or othercells under the control of appropriate promoters. Cell-free translationsystems can also be employed to produce such proteins using RNAs derivedfrom the DNA constructs of the present invention. Appropriate cloningand expression vectors for use with prokaryotic and eukaryotic hosts aredescribed by Sambrook, et al., Molecular Cloning: A Laboratory Manual,Second Edition, Cold Spring Harbor, N.Y., [1989].

In some embodiments of the present invention, following transformationof a suitable host strain and growth of the host strain to anappropriate cell density, the selected promoter is induced byappropriate means (e.g., temperature shift or chemical induction) andcells are cultured for an additional period. In other embodiments of thepresent invention, cells are typically harvested by centrifugation,disrupted by physical or chemical means, and the resulting crude extractretained for further purification. In still other embodiments of thepresent invention, microbial cells employed in expression of proteinscan be disrupted by any convenient method, including freeze-thawcycling, sonication, mechanical disruption, or use of cell lysingagents.

3. Purification of NOD polypeptides

The present invention also provides methods for recovering and purifyingNOD polypeptides from recombinant cell cultures including, but notlimited to, ammonium sulfate or ethanol precipitation, acid extraction,anion or cation exchange chromatography, phosphocellulosechromatography, hydrophobic interaction chromatography, affinitychromatography, hydroxylapatite chromatography and lectinchromatography. In other embodiments of the present invention,protein-refolding steps can be used as necessary, in completingconfiguration of the mature protein. In still other embodiments of thepresent invention, high performance liquid chromatography (HPLC) can beemployed for final purification steps.

The present invention further provides polynucleotides having a codingsequence of a NOD gene (e.g., SEQ ID NOs: 1-11) fused in frame to amarker sequence that allows for purification of the polypeptide of thepresent invention. A non-limiting example of a marker sequence is ahexahistidine tag which may be supplied by a vector, preferably a pQE-9vector, which provides for purification of the polypeptide fused to themarker in the case of a bacterial host, or, for example, the markersequence may be a hemagglutinin (HA) tag when a mammalian host (e.g.,COS-7 cells) is used. The HA tag corresponds to an epitope derived fromthe influenza hemagglutinin protein (Wilson et al., Cell, 37:767[1984]).
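
By way of non-limiting illustration, fusing a coding sequence in frame to a hexahistidine marker sequence can be sketched as follows. The sketch simply places a start codon and six histidine codons ahead of an intact open reading frame and confirms that the reading frame is preserved. It does not reproduce the sequence or cloning sites of the pQE-9 vector, and the function name and example sequence are hypothetical.

```python
# Illustrative sketch only: builds ATG + His6 tag + coding sequence and checks
# that the result remains a single open reading frame.
HIS6 = "CAT" * 6  # six histidine codons (CAT encodes histidine)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_n_terminal_tag(coding_seq: str, tag: str = HIS6) -> str:
    """Return a construct with the tag fused in frame to the 5' end."""
    coding_seq = coding_seq.upper()
    tag = tag.upper()
    if len(coding_seq) % 3 or len(tag) % 3:
        raise ValueError("tag and coding sequence must be whole codons")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    if not codons or codons[0] != "ATG":
        raise ValueError("coding sequence must begin with ATG")
    if codons[-1] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("coding sequence contains an internal stop codon")
    # The tag length is a multiple of three, so the downstream frame is kept.
    return "ATG" + tag + coding_seq


if __name__ == "__main__":
    example_cds = "ATGGCTAAATAA"  # hypothetical ORF: Met-Ala-Lys-stop
    print(fuse_n_terminal_tag(example_cds))
```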

4. Truncation Mutants of NOD Polypeptide

In addition, the present invention provides fragments of NOD polypeptides (i.e., truncation mutants). In some embodiments of the present invention, when expression of a portion of the NOD protein is desired, it may be necessary to add a start codon (ATG) to the oligonucleotide fragment containing the desired sequence to be expressed. It is well known in the art that a methionine at the N-terminal position can be enzymatically cleaved by the use of the enzyme methionine aminopeptidase (MAP). MAP has been cloned from E. coli (Ben-Bassat et al., J. Bacteriol., 169:751 [1987]) and Salmonella typhimurium, and its in vitro activity has been demonstrated on recombinant proteins (Miller et al., Proc. Natl. Acad. Sci. USA 84:2718 [1990]). Therefore, removal of an N-terminal methionine, if desired, can be achieved either in vivo by expressing such recombinant polypeptides in a host which produces MAP (e.g., E. coli or CM89 or S. cerevisiae), or in vitro by use of purified MAP.

5. Fusion Proteins Containing NOD

The present invention also provides fusion proteins incorporating all orpart of the NOD polypeptides of the present invention. Accordingly, insome embodiments of the present invention, the coding sequences for thepolypeptide can be incorporated as a part of a fusion gene including anucleotide sequence encoding a different polypeptide. It is contemplatedthat this type of expression system will find use under conditions whereit is desirable to produce an immunogenic fragment of a NOD protein. Insome embodiments of the present invention, the VP6 capsid protein ofrotavirus is used as an immunologic carrier protein for portions of aNOD polypeptide, either in the monomeric form or in the form of a viralparticle. In other embodiments of the present invention, the nucleicacid sequences corresponding to the portion of a NOD polypeptide againstwhich antibodies are to be raised can be incorporated into a fusion geneconstruct which includes coding sequences for a late vaccinia virusstructural protein to produce a set of recombinant viruses expressingfusion proteins comprising a portion of NOD as part of the virion. Ithas been demonstrated with the use of immunogenic fusion proteinsutilizing the hepatitis B surface antigen fusion proteins thatrecombinant hepatitis B virions can be utilized in this role as well.Similarly, in other embodiments of the present invention, chimericconstructs coding for fusion proteins containing a portion of a NODpolypeptide and the poliovirus capsid protein are created to enhanceimmunogenicity of the set of polypeptide antigens (See e.g., EPPublication No. 025949; and Evans et al., Nature 339:385 [1989]; Huanget al., J. Virol., 62:3855 [1988]; and Schlienger et al., J. Virol.,66:2 [1992]).

In still other embodiments of the present invention, the multipleantigen peptide system for peptide-based immunization can be utilized.In this system, a desired portion of NOD is obtained directly fromorgano-chemical synthesis of the peptide onto an oligomeric branchinglysine core (see e.g., Posnett et al., J. Biol. Chem., 263:1719 [1988];and Nardelli et al., J. Immunol., 148:914 [1992]). In other embodimentsof the present invention, antigenic determinants of the NOD proteins canalso be expressed and presented by bacterial cells.

In addition to utilizing fusion proteins to enhance immunogenicity, it is widely appreciated that fusion proteins can also facilitate the expression of proteins, such as a NOD protein of the present invention. Accordingly, in some embodiments of the present invention, NOD polypeptides can be generated as glutathione-S-transferase fusion proteins (i.e., GST fusion proteins). It is contemplated that such GST fusion proteins will enable easy purification of NOD polypeptides, such as by the use of glutathione-derivatized matrices (See e.g., Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, NY [1991]). In another embodiment of the present invention, a fusion gene coding for a purification leader sequence, such as a poly-(His)/enterokinase cleavage site sequence at the N-terminus of the desired portion of a NOD polypeptide, can allow purification of the expressed NOD fusion protein by affinity chromatography using a Ni²⁺ metal resin. In still another embodiment of the present invention, the purification leader sequence can then be removed by treatment with enterokinase (See e.g., Hochuli et al., J. Chromatogr., 411:177 [1987]; and Janknecht et al., Proc. Natl. Acad. Sci. USA 88:8972).
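By way of illustration only, the layout of such a purification leader can be sketched in silico. The Python sketch below assumes a hexahistidine tag and the canonical enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys, with cleavage after the lysine. The function names and the target peptide are hypothetical placeholders and are not NOD sequences.

```python
# Canonical enterokinase recognition sequence; cleavage occurs after the lysine.
ENTEROKINASE_SITE = "DDDDK"


def build_fusion(target: str, his_count: int = 6) -> str:
    """Return Met + poly-His tag + enterokinase site + target (one-letter codes)."""
    return "M" + "H" * his_count + ENTEROKINASE_SITE + target


def cleave(fusion: str) -> tuple[str, str]:
    """Split a fusion at the first enterokinase site into (leader, released peptide)."""
    idx = fusion.find(ENTEROKINASE_SITE)
    if idx == -1:
        raise ValueError("no enterokinase site found in fusion sequence")
    cut = idx + len(ENTEROKINASE_SITE)
    return fusion[:cut], fusion[cut:]


# Hypothetical placeholder peptide, used only to exercise the functions.
TARGET = "ACDEFGHIK"
fusion = build_fusion(TARGET)
leader, released = cleave(fusion)
```

In this sketch the released peptide is identical to the input target, reflecting that enterokinase cleaves at the C-terminal side of its recognition sequence and leaves no leader residues on the product.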

Techniques for making fusion genes are well known. Essentially, the joining of various DNA fragments coding for different polypeptide sequences is performed in accordance with conventional techniques, employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment of the present invention, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, in other embodiments of the present invention, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed to generate a chimeric gene sequence (See e.g., Current Protocols in Molecular Biology, supra).

6. Variants of NOD

Still other embodiments of the present invention provide mutant or variant forms of NOD polypeptides (i.e., muteins). It is possible to modify the structure of a peptide having an activity of a NOD polypeptide of the present invention for such purposes as enhancing therapeutic or prophylactic efficacy, or stability (e.g., ex vivo shelf life, and/or resistance to proteolytic degradation in vivo). Such modified peptides are considered functional equivalents of peptides having an activity of the subject NOD proteins as defined herein. A modified peptide can be produced in which the amino acid sequence has been altered, such as by amino acid substitution, deletion, or addition.

Moreover, as described above, variant forms (e.g., mutants or polymorphic sequences) of the subject NOD proteins are also contemplated as being equivalent to those peptides and DNA molecules that are set forth in more detail. For example, as described above, the present invention encompasses mutant and variant proteins that contain conservative or non-conservative amino acid substitutions.

This invention further contemplates a method of generating sets of combinatorial mutants of the present NOD proteins, as well as truncation mutants, and is especially useful for identifying potential variant sequences (i.e., mutants or polymorphic sequences) that are involved in inflammatory diseases or resistance to inflammatory diseases. The purpose of screening such combinatorial libraries is to generate, for example, novel NOD variants that can act as either agonists or antagonists, or alternatively, possess novel activities altogether.

Therefore, in some embodiments of the present invention, NOD variants are engineered by the present method to provide altered (e.g., increased or decreased) biological activity. In other embodiments of the present invention, combinatorially-derived variants are generated which have a selective potency relative to a naturally occurring NOD. Such proteins, when expressed from recombinant DNA constructs, can be used in gene therapy protocols.

Still other embodiments of the present invention provide NOD variants that have intracellular half-lives dramatically different than the corresponding wild-type protein. For example, the altered protein can be rendered either more stable or less stable to proteolytic degradation or other cellular processes that result in destruction of, or otherwise inactivate, NOD polypeptides. Such variants, and the genes which encode them, can be utilized to alter the location of NOD expression by modulating the half-life of the protein. For instance, a short half-life can give rise to more transient NOD biological effects and, when part of an inducible expression system, can allow tighter control of NOD levels within the cell. As above, such proteins, and particularly their recombinant nucleic acid constructs, can be used in gene therapy protocols.

In still other embodiments of the present invention, NOD variants are generated by the combinatorial approach to act as antagonists, in that they are able to interfere with the ability of the corresponding wild-type protein to regulate cell function.

In some embodiments of the combinatorial mutagenesis approach of the present invention, the amino acid sequences for a population of NOD homologs, variants or other related proteins are aligned, preferably to promote the highest homology possible. Such a population of variants can include, for example, NOD homologs from one or more species, or NOD variants from the same species but which differ due to mutation or polymorphisms. Amino acids that appear at each position of the aligned sequences are selected to create a degenerate set of combinatorial sequences.
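The selection of residues at each aligned position can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length, with "-" marking gaps, and it enumerates every combination of the residues observed at each column. The function names and the three short input sequences are hypothetical placeholders and are not NOD sequences.

```python
from itertools import product
from math import prod


def position_sets(aligned: list[str]) -> list[list[str]]:
    """Collect the residues observed at each column of a pre-aligned set."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    columns = []
    for i in range(len(aligned[0])):
        residues = sorted({s[i] for s in aligned} - {"-"})  # ignore gap characters
        if residues:  # columns consisting only of gaps are dropped
            columns.append(residues)
    return columns


def library_size(columns: list[list[str]]) -> int:
    """Number of distinct sequences in the degenerate set."""
    return prod(len(c) for c in columns)


def enumerate_library(columns: list[list[str]]):
    """Yield every combinatorial sequence, one residue choice per column."""
    for combo in product(*columns):
        yield "".join(combo)


# Hypothetical aligned fragments, used only to exercise the functions.
ALIGNED = ["MKTLA", "MRTLA", "MKSLV"]
columns = position_sets(ALIGNED)
variants = list(enumerate_library(columns))
```

Because the library size is the product of the number of residues at each position, it grows quickly with the number of variable columns; in practice a library of realistic size would be sampled rather than enumerated exhaustively.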

In a preferred embodiment of the present invention, the combinatorial NOD library is produced by way of a degenerate library of genes encoding a library of polypeptides which each include at least a portion of potential NOD protein sequences. For example, a mixture of synthetic oligonucleotides can be enzymatically ligated into gene sequences such that the degenerate set of potential NOD sequences are expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of NOD sequences therein.

There are many ways by which the library of potential NOD homologs and variants can be generated from a degenerate oligonucleotide sequence. In some embodiments, chemical synthesis of a degenerate gene sequence is carried out in an automatic DNA synthesizer, and the synthetic genes are ligated into an appropriate gene for expression. The purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential NOD sequences. The synthesis of degenerate oligonucleotides is well known in the art (See e.g., Narang, Tetrahedron Lett., 39:39 [1983]; Itakura et al., Recombinant DNA, in Walton (ed.), Proceedings of the 3rd Cleveland Symposium on Macromolecules, Elsevier, Amsterdam, pp 273-289 [1981]; Itakura et al., Annu. Rev. Biochem., 53:323 [1984]; Itakura et al., Science 198:1056 [1984]; Ike et al., Nucl. Acid Res., 11:477 [1983]). Such techniques have been employed in the directed evolution of other proteins (See e.g., Scott et al., Science 249:386 [1990]; Roberts et al., Proc. Natl. Acad. Sci. USA 89:2429 [1992]; Devlin et al., Science 249:404 [1990]; Cwirla et al., Proc. Natl. Acad. Sci. USA 87:6378 [1990]; each of which is herein incorporated by reference; as well as U.S. Pat. Nos. 5,223,409, 5,198,346, and 5,096,815; each of which is incorporated herein by reference).
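As an illustration of how a single degenerate oligonucleotide encodes a mixture of sequences, the following Python sketch expands an oligonucleotide written with the standard IUPAC nucleotide ambiguity codes into every sequence it represents. The function names and the example oligonucleotides are hypothetical and are not taken from the NOD sequences.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences encoded by a degenerate oligonucleotide."""
    count = 1
    for base in oligo.upper():
        if base not in IUPAC:
            raise ValueError(f"not an IUPAC nucleotide code: {base!r}")
        count *= len(IUPAC[base])
    return count


def expand(oligo: str) -> list[str]:
    """List every unambiguous sequence encoded by a degenerate oligonucleotide."""
    degeneracy(oligo)  # validates the input symbols
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(bases) for bases in product(*choices)]


# "ARG" encodes the codons AAG (Lys) and AGG (Arg).
lys_or_arg = expand("ARG")
# "NNK" is a commonly used randomized codon.
nnk_count = degeneracy("NNK")
```

Checking the degeneracy before synthesis gives a quick estimate of how many clones would be needed to sample the mixture.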

It is contemplated that the NOD nucleic acids of the present invention (e.g., SEQ ID NOs: 1-11, and fragments and variants thereof) can be utilized as starting nucleic acids for directed evolution. These techniques can be utilized to develop NOD variants having desirable properties such as increased or decreased biological activity.

In some embodiments, artificial evolution is performed by random mutagenesis (e.g., by utilizing error-prone PCR to introduce random mutations into a given coding sequence). This method requires that the frequency of mutation be finely tuned. As a general rule, beneficial mutations are rare, while deleterious mutations are common. The frequency must be tuned because the combination of a deleterious mutation and a beneficial mutation often results in an inactive enzyme. The ideal number of base substitutions for a targeted gene is usually between 1.5 and 5 (Moore and Arnold, Nat. Biotech., 14, 458 [1996]; Leung et al., Technique, 1:11 [1989]; Eckert and Kunkel, PCR Methods Appl., 1:17-24 [1991]; Caldwell and Joyce, PCR Methods Appl., 2:28 [1992]; and Zhao and Arnold, Nuc. Acids. Res., 25:1307 [1997]). After mutagenesis, the resulting clones are selected for desirable activity (e.g., screened for NOD activity). Successive rounds of mutagenesis and selection are often necessary to develop enzymes with desirable properties. It should be noted that only the useful mutations are carried over to the next round of mutagenesis.
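The tuning of mutation frequency can be made concrete with a short calculation. The Python sketch below assumes, as a simplification not stated in the text, that substitutions arise independently at each base, so that the number of substitutions per clone is approximately Poisson distributed. The gene length and target mean used in the example are hypothetical.

```python
from math import exp, factorial


def per_base_rate(target_mean: float, gene_length_bp: int) -> float:
    """Per-base substitution frequency giving the target mean per gene."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    return target_mean / gene_length_bp


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a clone carries exactly k substitutions (Poisson model)."""
    return exp(-mean) * mean ** k / factorial(k)


def fraction_between(mean: float, low: int, high: int) -> float:
    """Fraction of clones carrying between low and high substitutions, inclusive."""
    return sum(poisson_pmf(k, mean) for k in range(low, high + 1))


# Hypothetical example: a 3000 bp coding sequence and a target mean of 3.
rate = per_base_rate(3.0, 3000)
unmutated = poisson_pmf(0, 3.0)          # clones with no substitution at all
in_window = fraction_between(3.0, 2, 5)  # clones with 2 to 5 substitutions
```

Under this model a low mean leaves many clones unmutated, while a high mean pushes most clones toward multiple substitutions, which is consistent with the need to tune the frequency described above.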

In other embodiments of the present invention, the polynucleotides of the present invention are used in gene shuffling or sexual PCR procedures (e.g., Smith, Nature, 370:324 [1994]; U.S. Pat. Nos. 5,837,458; 5,830,721; 5,811,238; 5,733,731; all of which are herein incorporated by reference). Gene shuffling involves random fragmentation of several mutant DNAs followed by their reassembly by PCR into full length molecules. Examples of various gene shuffling procedures include, but are not limited to, assembly following DNase treatment, the staggered extension process (STEP), and random priming in vitro recombination. In the DNase mediated method, DNA segments isolated from a pool of positive mutants are cleaved into random fragments with DNase I and subjected to multiple rounds of PCR with no added primer. The lengths of random fragments approach that of the uncleaved segment as the PCR cycles proceed, resulting in mutations present in different clones becoming mixed and accumulating in some of the resulting sequences. Multiple cycles of selection and shuffling have led to the functional enhancement of several enzymes (Stemmer, Nature, 370:398 [1994]; Stemmer, Proc. Natl. Acad. Sci. USA, 91:10747 [1994]; Crameri et al., Nat. Biotech., 14:315 [1996]; Zhang et al., Proc. Natl. Acad. Sci. USA, 94:4504 [1997]; and Crameri et al., Nat. Biotech., 15:436 [1997]). Variants produced by directed evolution can be screened for NOD activity by the methods described herein.
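The mixing of mutations from different clones can be pictured with a deliberately simplified in silico model. The Python sketch below does not model DNase I fragmentation or primerless PCR; it assumes equal-length, pre-aligned parent sequences and simply switches template at randomly chosen crossover points. All names and sequences are hypothetical.

```python
import random


def shuffle_parents(parents: list[str], n_crossovers: int, rng: random.Random) -> str:
    """Build one chimera by switching template at random crossover points."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be pre-aligned and of equal length")
    if not 0 <= n_crossovers < length:
        raise ValueError("number of crossovers must be below the sequence length")
    points = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + points + [length]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        template = rng.choice(parents)  # each segment is copied from one parent
        pieces.append(template[start:end])
    return "".join(pieces)


# Hypothetical parents, each carrying a different point mutation (lower case).
PARENTS = ["ATGcCCGGTTAA", "ATGACCGGTtAA", "ATGACCgGTTAA"]
rng = random.Random(0)  # seeded so that the run is repeatable
chimeras = [shuffle_parents(PARENTS, 3, rng) for _ in range(20)]
```

Every position of a chimera is inherited from one of the parents at that position, so a chimera can combine mutations that were originally present in separate clones.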

A wide range of techniques are known in the art for screening gene products of combinatorial libraries made by point mutations, and for screening cDNA libraries for gene products having a certain property. Such techniques will be generally adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis or recombination of NOD homologs or variants. The most widely used techniques for screening large gene libraries typically comprise cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates relatively easy isolation of the vector encoding the gene whose product was detected.

7. Chemical Synthesis of NOD Polypeptides

In an alternate embodiment of the invention, the coding sequence of NOD is synthesized, whole or in part, using chemical methods well known in the art (See e.g., Caruthers et al., Nucl. Acids Res. Symp. Ser., 7:215 [1980]; Crea and Horn, Nucl. Acids Res., 9:2331 [1980]; Matteucci and Caruthers, Tetrahedron Lett., 21:719 [1980]; and Chow and Kempe, Nucl. Acids Res., 9:2807 [1981]). In other embodiments of the present invention, the protein itself is produced using chemical methods to synthesize either an entire NOD amino acid sequence or a portion thereof. For example, peptides can be synthesized by solid phase techniques, cleaved from the resin, and purified by preparative high performance liquid chromatography (See e.g., Creighton, Proteins Structures And Molecular Principles, W H Freeman and Co, New York N.Y. [1983]). In other embodiments of the present invention, the composition of the synthetic peptides is confirmed by amino acid analysis or sequencing (See e.g., Creighton, supra).

Direct peptide synthesis can be performed using various solid-phase techniques (Roberge et al., Science 269:202 [1995]) and automated synthesis may be achieved, for example, using ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer. Additionally, the amino acid sequence of a NOD polypeptide, or any part thereof, may be altered during direct synthesis and/or combined using chemical methods with other sequences to produce a variant polypeptide.

III. Detection of NOD Alleles

In some embodiments, the present invention provides methods of detecting the presence of wild type or variant (e.g., mutant or polymorphic) NOD nucleic acids or polypeptides. The detection of mutant NOD polypeptides finds use in the diagnosis of disease (e.g., inflammatory disease).

A. Detection of Variant NOD Alleles

In some embodiments, the present invention provides alleles of NOD that increase a patient's susceptibility to inflammatory diseases. Any mutation that results in an altered phenotype (e.g., increase in inflammatory disease or resistance to inflammatory disease) is within the scope of the present invention.

Accordingly, the present invention provides methods for determining whether a patient has an increased susceptibility to an inflammatory disease by determining whether the individual has a variant NOD allele. In other embodiments, the present invention provides methods for providing a prognosis of increased risk for inflammatory disease to an individual based on the presence or absence of one or more variant alleles of NOD.

A number of methods are available for analysis of variant (e.g., mutant or polymorphic) nucleic acid sequences. Assays for detecting variants (e.g., polymorphisms or mutations) fall into several categories including, but not limited to, direct sequencing assays, fragment polymorphism assays, hybridization assays, and computer based data analysis. Protocols and commercially available kits or services for performing multiple variations of these assays are available. In some embodiments, assays are performed in combination or in hybrid (e.g., different reagents or technologies from several assays are combined to yield one assay). The following exemplary assays are useful in the present invention: direct sequencing assays, PCR assays, mutational analysis by dHPLC (e.g., available from Transgenomic, Omaha, Nebr. or Varian, Palo Alto, Calif.), fragment length polymorphism assays (e.g., RFLP or CFLP (See e.g., U.S. Pat. Nos. 5,843,654; 5,843,669; 5,719,208; and 5,888,780; each of which is herein incorporated by reference)), hybridization assays (e.g., direct detection of hybridization, detection of hybridization using DNA chip assays (See e.g., U.S. Pat. Nos. 6,045,996; 5,925,525; 5,858,659; 6,017,696; 6,068,818; 6,051,380; 6,001,311; 5,985,551; 5,474,796; PCT Publications WO 99/67641 and WO 00/39587, each of which is herein incorporated by reference), enzymatic detection of hybridization (See e.g., U.S. Pat. Nos. 5,846,717; 6,090,543; 6,001,567; 5,985,557; 5,994,069; 5,962,233; 5,538,848; 5,952,174 and 5,919,626, each of which is herein incorporated by reference)), and mass spectrometry assays. In addition, assays for the detection of variant NOD proteins find use in the present invention (e.g., cell free translation methods (See e.g., U.S. Pat. No. 6,303,337, herein incorporated by reference) and antibody binding assays).
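The principle behind a fragment length polymorphism assay such as RFLP can be sketched in Python: a variant that destroys or creates a restriction site changes the fragment lengths obtained after digestion. The sketch below assumes a linear molecule and the EcoRI recognition sequence GAATTC, cut between G and A on the top strand. The two short sequences are hypothetical and are not NOD sequences.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases from the start of the recognition
    sequence to the cut position on the top strand.
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1  # EcoRI cuts G^AATTC

# Hypothetical amplicons: the variant carries a single A-to-G change in the site.
REFERENCE = "ATGCC" + "GAATTC" + "GGTACCTTAGC"
VARIANT = "ATGCC" + "GAGTTC" + "GGTACCTTAGC"

reference_fragments = digest(REFERENCE, ECORI_SITE, ECORI_OFFSET)
variant_fragments = digest(VARIANT, ECORI_SITE, ECORI_OFFSET)
```

Here the reference amplicon yields two fragments while the variant remains uncut, which is the kind of difference that would be resolved on a gel.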

B. Kits for Analyzing Risk of Inflammatory Disease

The present invention also provides kits for determining whether an individual contains a wild-type or variant (e.g., mutant or polymorphic) allele or polypeptide of NOD. In some embodiments, the kits are useful for determining whether the subject is at risk of developing an inflammatory disease (e.g., Crohn's disease or psoriasis). The diagnostic kits are produced in a variety of ways. In some embodiments, the kits contain at least one reagent for specifically detecting a mutant NOD allele or protein. In preferred embodiments, the reagent is a nucleic acid that hybridizes to nucleic acids containing the mutation and that does not bind to nucleic acids that do not contain the mutation. In other embodiments, the reagents are primers for amplifying the region of DNA containing the mutation. In still other embodiments, the reagents are antibodies that preferentially bind either the wild-type or mutant NOD proteins.

In some embodiments, the kit contains instructions for determining whether the subject is at risk for an inflammatory disease. In preferred embodiments, the instructions specify that risk for developing an inflammatory disease is determined by detecting the presence or absence of a mutant NOD allele in the subject, wherein subjects having a mutant allele are at greater risk for developing an inflammatory disease.

The presence or absence of a disease-associated mutation in a NOD gene can be used to make therapeutic or other medical decisions. For example, couples with a family history of inflammatory diseases may choose to conceive a child via in vitro fertilization and pre-implantation genetic screening. In this case, fertilized embryos are screened for mutant (e.g., disease associated) alleles of a NOD gene and only embryos with wild type alleles are implanted in the uterus.

In other embodiments, in utero screening is performed on a developing fetus (e.g., amniocentesis or chorionic villi screening). In still other embodiments, genetic screening of newborn babies or very young children is performed. The early detection of a NOD allele known to be associated with an inflammatory disease allows for early intervention (e.g., genetic or pharmaceutical therapies).

In some embodiments, the kits include ancillary reagents such as buffering agents, nucleic acid stabilizing reagents, protein stabilizing reagents, and signal producing systems (e.g., fluorescence generating systems such as FRET systems). The test kit may be packaged in any suitable manner, typically with the elements in a single container or various containers as necessary along with a sheet of instructions for carrying out the test. In some embodiments, the kits also preferably include a positive control sample.

C. Bioinformatics

In some embodiments, the present invention provides methods of determining an individual's risk of developing an inflammatory disease based on the presence of one or more variant alleles of a NOD gene. In some embodiments, the analysis of variant data is processed by a computer using information stored on a computer (e.g., in a database). For example, in some embodiments, the present invention provides a bioinformatics research system comprising a plurality of computers running a multi-platform object oriented programming language (See e.g., U.S. Pat. No. 6,125,383; herein incorporated by reference). In some embodiments, one of the computers stores genetics data (e.g., the risk of contracting an inflammatory disease associated with a given polymorphism, as well as the sequences). In some embodiments, one of the computers stores application programs (e.g., for analyzing the results of detection assays). Results are then delivered to the user (e.g., via one of the computers or via the internet).

For example, in some embodiments, a computer-based analysis program is used to translate the raw data generated by the detection assay (e.g., the presence, absence, or amount of a given NOD allele or polypeptide) into data of predictive value for a clinician. The clinician can access the predictive data using any suitable means. Thus, in some preferred embodiments, the present invention provides the further benefit that the clinician, who is not likely to be trained in genetics or molecular biology, need not understand the raw data. The data is presented directly to the clinician in its most useful form. The clinician is then able to immediately utilize the information in order to optimize the care of the subject.

The present invention contemplates any method capable of receiving, processing, and transmitting the information to and from laboratories conducting the assays, information providers, medical personnel, and subjects. For example, in some embodiments of the present invention, a sample (e.g., a biopsy or a serum or urine sample) is obtained from a subject and submitted to a profiling service (e.g., clinical lab at a medical facility, genomic profiling business, etc.), located in any part of the world (e.g., in a country different than the country where the subject resides or where the information is ultimately used) to generate raw data. Where the sample comprises a tissue or other biological sample, the subject may visit a medical center to have the sample obtained and sent to the profiling center, or subjects may collect the sample themselves (e.g., a urine sample) and directly send it to a profiling center. Where the sample comprises previously determined biological information, the information may be directly sent to the profiling service by the subject (e.g., an information card containing the information may be scanned by a computer and the data transmitted to a computer of the profiling center using an electronic communication system). Once received by the profiling service, the sample is processed and a profile is produced (i.e., presence of wild type or mutant NOD genes or polypeptides), specific for the diagnostic or prognostic information desired for the subject.

The profile data is then prepared in a format suitable for interpretation by a treating clinician. For example, rather than providing raw data, the prepared format may represent a diagnosis or risk assessment (e.g., likelihood of developing an inflammatory disease) for the subject, along with recommendations for particular treatment options. The data may be displayed to the clinician by any suitable method. For example, in some embodiments, the profiling service generates a report that can be printed for the clinician (e.g., at the point of care) or displayed to the clinician on a computer monitor.
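The translation of raw assay data into a clinician-facing report can be sketched as follows. This Python sketch is illustrative only: the allele labels, the annotation table, and the wording of the report are hypothetical placeholders, and no risk values are implied. A working system would draw its annotations from a curated database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleCall:
    allele_id: str  # hypothetical label for a NOD allele
    copies: int     # number of copies detected by the assay (0, 1, or 2)


# Hypothetical annotation table standing in for a curated genetics database.
DISEASE_ASSOCIATED = frozenset({"variant_A", "variant_B"})


def build_report(subject_id: str, calls: list[AlleleCall]) -> str:
    """Summarize raw allele calls as a short plain-text report."""
    flagged = sorted(
        call.allele_id
        for call in calls
        if call.copies > 0 and call.allele_id in DISEASE_ASSOCIATED
    )
    lines = [f"Subject: {subject_id}"]
    if flagged:
        lines.append("Disease-associated alleles detected: " + ", ".join(flagged))
        lines.append("Assessment: variant allele present; clinical follow-up suggested.")
    else:
        lines.append("Disease-associated alleles detected: none")
        lines.append("Assessment: no annotated variant allele detected.")
    return "\n".join(lines)


calls = [AlleleCall("variant_A", 1), AlleleCall("variant_B", 0), AlleleCall("variant_C", 2)]
report = build_report("subject-001", calls)
```

Only alleles that are both detected and annotated appear in the report, so the clinician sees an interpretation rather than the underlying raw calls.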

In some embodiments, the information is first analyzed at the point of care or at a regional facility. The raw data is then sent to a central processing facility for further analysis and/or to convert the raw data to information useful for a clinician or patient. The central processing facility provides the advantage of privacy (all data is stored in a central facility with uniform security protocols), speed, and uniformity of data analysis. The central processing facility can then control the fate of the data following treatment of the subject. For example, using an electronic communication system, the central facility can provide data to the clinician, the subject, or researchers.

In some embodiments, the subject is able to directly access the data using the electronic communication system. The subject may choose further intervention or counseling based on the results. In some embodiments, the data is used for research use. For example, the data may be used to further optimize the association of a given NOD allele with inflammatory diseases.

IV. Generation of NOD Antibodies

The present invention provides isolated antibodies or antibody fragments (e.g., Fab fragments). Antibodies can be generated to allow for the detection of a NOD protein of the present invention. The antibodies may be prepared using various immunogens. In one embodiment, the immunogen is a human NOD peptide to generate antibodies that recognize human NOD. Such antibodies include, but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, Fab expression libraries, or recombinant (e.g., chimeric, humanized, etc.) antibodies, as long as they can recognize the protein. Antibodies can be produced by using a protein of the present invention as the antigen according to a conventional antibody or antiserum preparation process.

Various procedures known in the art may be used for the production of polyclonal antibodies directed against a NOD polypeptide. For the production of antibody, various host animals can be immunized by injection with the peptide corresponding to the NOD epitope including but not limited to rabbits, mice, rats, sheep, goats, etc. In a preferred embodiment, the peptide is conjugated to an immunogenic carrier (e.g., diphtheria toxoid, bovine serum albumin (BSA), or keyhole limpet hemocyanin (KLH)). Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, and dinitrophenol), and potentially useful human adjuvants such as BCG (Bacille Calmette-Guerin) and Corynebacterium parvum.

For preparation of monoclonal antibodies directed toward NOD, it is contemplated that any technique that provides for the production of antibody molecules by continuous cell lines in culture will find use with the present invention (See e.g., Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). These include but are not limited to the hybridoma technique originally developed by Köhler and Milstein (Köhler and Milstein, Nature 256:495-497 [1975]), as well as the trioma technique, the human B-cell hybridoma technique (See e.g., Kozbor et al., Immunol. Tod., 4:72 [1983]), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 [1985]).

In an additional embodiment of the invention, monoclonal antibodies are produced in germ-free animals utilizing technology such as that described in PCT/US90/02545. Furthermore, it is contemplated that human antibodies will be generated by human hybridomas (Cote et al., Proc. Natl. Acad. Sci. USA 80:2026-2030 [1983]) or by transforming human B cells with EBV in vitro (Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96 [1985]).

In addition, it is contemplated that techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,778; herein incorporated by reference) will find use in producing NOD specific single chain antibodies. An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al., Science 246:1275-1281 [1989]) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for a NOD polypeptide.

In other embodiments, the present invention contemplates recombinant antibodies or fragments thereof to the proteins of the present invention. Recombinant antibodies include, but are not limited to, humanized and chimeric antibodies. Methods for generating recombinant antibodies are known in the art (See e.g., U.S. Pat. Nos. 6,180,370 and 6,277,969 and “Monoclonal Antibodies” H. Zola, BIOS Scientific Publishers Limited 2000. Springer-Verlag New York, Inc., New York; each of which is herein incorporated by reference).

It is contemplated that any technique suitable for producing antibody fragments will find use in generating antibody fragments that contain the idiotype (antigen binding region) of the antibody molecule. For example, such fragments include but are not limited to: F(ab′)2 fragment that can be produced by pepsin digestion of the antibody molecule; Fab′ fragments that can be generated by reducing the disulfide bridges of the F(ab′)2 fragment, and Fab fragments that can be generated by treating the antibody molecule with papain and a reducing agent.

In the production of antibodies, it is contemplated that screening for the desired antibody will be accomplished by techniques known in the art (e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitation reactions, immunodiffusion assays, in situ immunoassays (e.g., using colloidal gold, enzyme or radioisotope labels), Western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc.).

In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. As is well known in the art, the immunogenic peptide should be provided free of the carrier molecule used in any immunization protocol. For example, if the peptide was conjugated to KLH, it may be conjugated to BSA, or used directly, in a screening assay.

The foregoing antibodies can be used in methods known in the art relating to the localization and structure of NOD (e.g., for Western blotting), measuring levels thereof in appropriate biological samples, etc. The antibodies can be used to detect a NOD in a biological sample from an individual. The biological sample can be a biological fluid, such as, but not limited to, blood, serum, plasma, interstitial fluid, urine, cerebrospinal fluid, and the like, containing cells.

The biological samples can then be tested directly for the presence of a human NOD using an appropriate strategy (e.g., ELISA or radioimmunoassay) and format (e.g., microwells, dipstick (e.g., as described in International Patent Publication WO 93/03367), etc.). Alternatively, proteins in the sample can be size separated (e.g., by polyacrylamide gel electrophoresis (PAGE), in the presence or absence of sodium dodecyl sulfate (SDS)), and the presence of NOD detected by immunoblotting (Western blotting). Immunoblotting techniques are generally more effective with antibodies generated against a peptide corresponding to an epitope of a protein, and hence, are particularly suited to the present invention.

Another method uses antibodies as agents to alter signal transduction. Specific antibodies that bind to the binding domains of NOD or other proteins involved in intracellular signaling can be used to inhibit the interaction between the various proteins and their interaction with other ligands. Antibodies that bind to the complex can also be used therapeutically to inhibit interactions of the protein complex in the signal transduction pathways leading to the various physiological and cellular effects of NOD. Such antibodies can also be used diagnostically to measure abnormal expression of NOD, or the aberrant formation of protein complexes, which may be indicative of a disease state.

V. Gene Therapy Using NOD

The present invention also provides methods and compositions suitable for gene therapy to alter NOD expression, production, or function. As described above, the present invention provides human NOD genes and provides methods of obtaining NOD genes from other species. Thus, the methods described below are generally applicable across many species. In some embodiments, it is contemplated that the gene therapy is performed by providing a subject with a wild-type allele of a NOD gene (i.e., an allele that is not a NOD disease allele (e.g., one free of disease causing polymorphisms or mutations)). Subjects in need of such therapy are identified by the methods described above.

Viral vectors commonly used for in vivo or ex vivo targeting and therapy procedures are DNA-based vectors and retroviral vectors. Methods for constructing and using viral vectors are known in the art (See e.g., Miller and Rosman, BioTech., 7:980-990 [1992]). Preferably, the viral vectors are replication defective, that is, they are unable to replicate autonomously in the target cell. In general, the genome of the replication defective viral vectors that are used within the scope of the present invention lacks at least one region that is necessary for the replication of the virus in the infected cell. These regions can either be eliminated (in whole or in part), or be rendered non-functional by any technique known to a person skilled in the art. These techniques include the total removal, substitution (by other sequences, in particular by the inserted nucleic acid), partial deletion or addition of one or more bases to an essential (for replication) region. Such techniques may be performed in vitro (i.e., on the isolated DNA) or in situ, using the techniques of genetic manipulation or by treatment with mutagenic agents.

Preferably, the replication defective virus retains the sequences of its genome that are necessary for encapsidating the viral particles. DNA viral vectors include attenuated or defective DNA viruses, including, but not limited to, herpes simplex virus (HSV), papillomavirus, Epstein Barr virus (EBV), adenovirus, adeno-associated virus (AAV), and the like. Defective viruses that entirely or almost entirely lack viral genes are preferred, as a defective virus is not infective after introduction into a cell. Use of defective viral vectors allows for administration to cells in a specific, localized area, without concern that the vector can infect other cells. Thus, a specific tissue can be specifically targeted. Examples of particular vectors include, but are not limited to, a defective herpes virus 1 (HSV1) vector (Kaplitt et al., Mol. Cell. Neurosci., 2:320-330 [1991]), defective herpes virus vector lacking a glycoprotein L gene (See e.g., Patent Publication RD 371005 A), or other defective herpes virus vectors (See e.g., WO 94/21807; and WO 92/05263); an attenuated adenovirus vector, such as the vector described by Stratford-Perricaudet et al. (J. Clin. Invest., 90:626-630 [1992]; See also, La Salle et al., Science 259:988-990 [1993]); and a defective adeno-associated virus vector (Samulski et al., J. Virol., 61:3096-3101 [1987]; Samulski et al., J. Virol., 63:3822-3828 [1989]; and Lebkowski et al., Mol. Cell. Biol., 8:3988-3996 [1988]).

Preferably, for in vivo administration, an appropriate immunosuppressive treatment is employed in conjunction with the viral vector (e.g., adenovirus vector), to avoid immuno-deactivation of the viral vector and transfected cells. For example, immunosuppressive cytokines, such as interleukin-12 (IL-12), interferon-gamma (IFN-γ), or anti-CD4 antibody, can be administered to block humoral or cellular immune responses to the viral vectors. In addition, it is advantageous to employ a viral vector that is engineered to express a minimal number of antigens.

In a preferred embodiment, the vector is an adenovirus vector. Adenoviruses are eukaryotic DNA viruses that can be modified to efficiently deliver a nucleic acid of the invention to a variety of cell types. Various serotypes of adenovirus exist. Of these serotypes, preference is given, within the scope of the present invention, to type 2 or type 5 human adenoviruses (Ad 2 or Ad 5), or adenoviruses of animal origin (See e.g., WO 94/26914). Those adenoviruses of animal origin that can be used within the scope of the present invention include adenoviruses of canine, bovine, murine (e.g., Mav1, Beard et al., Virol., 75-81 [1990]), ovine, porcine, avian, and simian (e.g., SAV) origin. Preferably, the adenovirus of animal origin is a canine adenovirus, more preferably a CAV2 adenovirus (e.g., Manhattan or A26/61 strain (ATCC VR-800)).

Preferably, the replication defective adenoviral vectors of the invention comprise the ITRs, an encapsidation sequence and the nucleic acid of interest. Still more preferably, at least the E1 region of the adenoviral vector is non-functional. The deletion in the E1 region preferably extends from nucleotides 455 to 3329 in the sequence of the Ad5 adenovirus (PvuII-BglII fragment) or 382 to 3446 (HinfII-Sau3A fragment). Other regions may also be modified, in particular the E3 region (e.g., WO 95/02697), the E2 region (e.g., WO 94/28938), the E4 region (e.g., WO 94/28152, WO 94/12649 and WO 95/02697), or any of the late genes L1-L5.
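
By way of illustration only, the size of each preferred E1 deletion can be checked with a few lines of Python. This is a minimal sketch: it assumes that the nucleotide coordinates quoted above are 1-based and inclusive, which the text does not state, and the function and variable names are hypothetical.

```python
def deletion_length(start: int, end: int) -> int:
    """Length in nucleotides of a deletion spanning start..end.

    Assumption (not stated in the text): coordinates are 1-based and inclusive.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for the E1 region of Ad5.
e1_deletions = {
    "PvuII-BglII": (455, 3329),
    "HinfII-Sau3A": (382, 3446),
}

# Map each fragment name to the number of nucleotides removed.
deletion_sizes = {
    name: deletion_length(start, end)
    for name, (start, end) in e1_deletions.items()
}
```

Under the inclusive-coordinate assumption, the two fragments correspond to deletions of 2875 and 3065 nucleotides, respectively.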

In a preferred embodiment, the adenoviral vector has a deletion in the E1 region (Ad 1.0). Examples of E1-deleted adenoviruses are disclosed in EP 185,573, the contents of which are incorporated herein by reference. In another preferred embodiment, the adenoviral vector has a deletion in the E1 and E4 regions (Ad 3.0). Examples of E1/E4-deleted adenoviruses are disclosed in WO 95/02697 and WO 96/22378. In still another preferred embodiment, the adenoviral vector has a deletion in the E1 region into which the E4 region and the nucleic acid sequence are inserted.

The replication defective recombinant adenoviruses according to the invention can be prepared by any technique known to the person skilled in the art (See e.g., Levrero et al., Gene 101:195 [1991]; EP 185 573; and Graham, EMBO J., 3:2917 [1984]). In particular, they can be prepared by homologous recombination between an adenovirus and a plasmid that carries, inter alia, the DNA sequence of interest. The homologous recombination is accomplished following co-transfection of the adenovirus and plasmid into an appropriate cell line. The cell line that is employed should preferably (i) be transformable by the elements to be used, and (ii) contain the sequences that are able to complement the part of the genome of the replication defective adenovirus, preferably in integrated form in order to avoid the risks of recombination. Examples of cell lines that may be used are the human embryonic kidney cell line 293 (Graham et al., J. Gen. Virol., 36:59 [1977]), which contains the left-hand portion of the genome of an Ad5 adenovirus (12%) integrated into its genome, and cell lines that are able to complement the E1 and E4 functions, as described in applications WO 94/26914 and WO 95/02697. Recombinant adenoviruses are recovered and purified using standard molecular biological techniques that are well known to one of ordinary skill in the art.

The adeno-associated viruses (AAV) are DNA viruses of relatively small size that can integrate, in a stable and site-specific manner, into the genome of the cells that they infect. They are able to infect a wide spectrum of cells without inducing any effects on cellular growth, morphology or differentiation, and they do not appear to be involved in human pathologies. The AAV genome has been cloned, sequenced and characterized. It encompasses approximately 4700 bases and contains an inverted terminal repeat (ITR) region of approximately 145 bases at each end, which serves as an origin of replication for the virus. The remainder of the genome is divided into two essential regions that carry the encapsidation functions: the left-hand part of the genome, that contains the rep gene involved in viral replication and expression of the viral genes; and the right-hand part of the genome, that contains the cap gene encoding the capsid proteins of the virus.

The use of vectors derived from the AAVs for transferring genes in vitro and in vivo has been described (See e.g., WO 91/18088; WO 93/09239; U.S. Pat. No. 4,797,368; U.S. Pat. No. 5,139,941; and EP 488 528, all of which are herein incorporated by reference). These publications describe various AAV-derived constructs in which the rep and/or cap genes are deleted and replaced by a gene of interest, and the use of these constructs for transferring the gene of interest in vitro (into cultured cells) or in vivo (directly into an organism). The replication defective recombinant AAVs according to the invention can be prepared by co-transfecting a plasmid containing the nucleic acid sequence of interest flanked by two AAV inverted terminal repeat (ITR) regions, and a plasmid carrying the AAV encapsidation genes (rep and cap genes), into a cell line that is infected with a human helper virus (for example an adenovirus). The AAV recombinants that are produced are then purified by standard techniques.

In another embodiment, the gene can be introduced in a retroviral vector (e.g., as described in U.S. Pat. Nos. 5,399,346, 4,650,764, 4,980,289 and 5,124,263; all of which are herein incorporated by reference; Mann et al., Cell 33:153 [1983]; Markowitz et al., J. Virol., 62:1120 [1988]; PCT/US95/14575; EP 453242; EP 178220; Bernstein et al., Genet. Eng., 7:235 [1985]; McCormick, BioTechnol., 3:689 [1985]; WO 95/07358; and Kuo et al., Blood 82:845 [1993]). The retroviruses are integrating viruses that infect dividing cells. The retrovirus genome includes two LTRs, an encapsidation sequence and three coding regions (gag, pol and env). In recombinant retroviral vectors, the gag, pol and env genes are generally deleted, in whole or in part, and replaced with a heterologous nucleic acid sequence of interest. These vectors can be constructed from different types of retrovirus, such as HIV, MoMuLV (“murine Moloney leukemia virus”), MSV (“murine Moloney sarcoma virus”), HaSV (“Harvey sarcoma virus”), SNV (“spleen necrosis virus”), RSV (“Rous sarcoma virus”) and Friend virus. Defective retroviral vectors are also disclosed in WO 95/02697.

In general, in order to construct recombinant retroviruses containing a nucleic acid sequence, a plasmid is constructed that contains the LTRs, the encapsidation sequence and the coding sequence. This construct is used to transfect a packaging cell line, which cell line is able to supply in trans the retroviral functions that are deficient in the plasmid. In general, the packaging cell lines are thus able to express the gag, pol and env genes. Such packaging cell lines have been described in the prior art, in particular the cell line PA317 (U.S. Pat. No. 4,861,719, herein incorporated by reference), the PsiCRIP cell line (See, WO90/02806), and the GP+envAm-12 cell line (See, WO89/07150). In addition, the recombinant retroviral vectors can contain modifications within the LTRs for suppressing transcriptional activity as well as extensive encapsidation sequences that may include a part of the gag gene (Bender et al., J. Virol., 61:1639 [1987]). Recombinant retroviral vectors are purified by standard techniques known to those having ordinary skill in the art.

Alternatively, the vector can be introduced in vivo by lipofection. For the past decade, there has been increasing use of liposomes for encapsulation and transfection of nucleic acids in vitro. Synthetic cationic lipids designed to limit the difficulties and dangers encountered with liposome mediated transfection can be used to prepare liposomes for in vivo transfection of a gene encoding a marker (Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 [1987]; See also, Mackey, et al., Proc. Natl. Acad. Sci. USA 85:8027-8031 [1988]; Ulmer et al., Science 259:1745-1748 [1993]). The use of cationic lipids may promote encapsulation of negatively charged nucleic acids, and also promote fusion with negatively charged cell membranes (Felgner and Ringold, Science 337:387-388 [1989]). Particularly useful lipid compounds and compositions for transfer of nucleic acids are described in WO95/18863 and WO96/17823, and in U.S. Pat. No. 5,459,127, herein incorporated by reference.

Other molecules are also useful for facilitating transfection of a nucleic acid in vivo, such as a cationic oligopeptide (e.g., WO95/21931), peptides derived from DNA binding proteins (e.g., WO96/25508), or a cationic polymer (e.g., WO95/21931).

It is also possible to introduce the vector in vivo as a naked DNA plasmid. Methods for formulating and administering naked DNA to mammalian muscle tissue are disclosed in U.S. Pat. Nos. 5,580,859 and 5,589,466, both of which are herein incorporated by reference.

DNA vectors for gene therapy can be introduced into the desired host cells by methods known in the art, including but not limited to transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, use of a gene gun, or use of a DNA vector transporter (See e.g., Wu et al., J. Biol. Chem., 267:963 [1992]; Wu and Wu, J. Biol. Chem., 263:14621 [1988]; and Williams et al., Proc. Natl. Acad. Sci. USA 88:2726 [1991]). Receptor-mediated DNA delivery approaches can also be used (Curiel et al., Hum. Gene Ther., 3:147 [1992]; and Wu and Wu, J. Biol. Chem., 262:4429 [1987]).

VI. Transgenic Animals Expressing Exogenous NOD Genes and Homologs, Mutants, and Variants Thereof

The present invention contemplates the generation of transgenic animals comprising an exogenous NOD gene or homologs, mutants, or variants thereof. In preferred embodiments, the transgenic animal displays an altered phenotype as compared to wild-type animals. In some embodiments, the altered phenotype is the overexpression of mRNA for a NOD gene as compared to wild-type levels of NOD expression. In other embodiments, the altered phenotype is the decreased expression of mRNA for an endogenous NOD gene as compared to wild-type levels of endogenous NOD expression. In some preferred embodiments, the transgenic animals comprise mutant alleles of NOD. Methods for analyzing the presence or absence of such phenotypes include Northern blotting, mRNA protection assays, and RT-PCR. In other embodiments, the transgenic mice have a knock out mutation of a NOD gene. In preferred embodiments, the transgenic animals display an altered susceptibility to inflammatory diseases.

Such animals find use in research applications (e.g., identifying signaling pathways that a NOD protein is involved in), as well as drug screening applications (e.g., to screen for drugs that prevent or treat inflammatory diseases). For example, in some embodiments, test compounds (e.g., a drug that is suspected of being useful to treat an inflammatory disease) are administered to the transgenic animals and control animals with a wild type NOD allele and the effects evaluated. The effects of the test and control compounds on disease symptoms are then assessed.

The transgenic animals can be generated via a variety of methods. In some embodiments, embryonal cells at various developmental stages are used to introduce transgenes for the production of transgenic animals. Different methods are used depending on the stage of development of the embryonal cell. The zygote is the best target for micro-injection. In the mouse, the male pronucleus reaches the size of approximately 20 micrometers in diameter, which allows reproducible injection of 1-2 picoliters (pl) of DNA solution. The use of zygotes as a target for gene transfer has a major advantage in that in most cases the injected DNA will be incorporated into the host genome before the first cleavage (Brinster et al., Proc. Natl. Acad. Sci. USA 82:4438-4442 [1985]). As a consequence, all cells of the transgenic non-human animal will carry the incorporated transgene. This will in general also be reflected in the efficient transmission of the transgene to offspring of the founder since 50% of the germ cells will harbor the transgene. U.S. Pat. No. 4,873,191 describes a method for the micro-injection of zygotes; the disclosure of this patent is incorporated herein in its entirety.
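
The consequence of the 50% figure for breeding can be illustrated with a short Python sketch. It is offered only as an example under simplifying assumptions that are not part of the disclosure: each offspring inherits the transgene independently, and the fraction of germ cells carrying the transgene is the 50% stated above. The function name is hypothetical.

```python
def prob_at_least_one_carrier(n_offspring: int, germ_cell_fraction: float = 0.5) -> float:
    """Probability that at least one of n offspring inherits the transgene.

    Assumptions (illustrative only): offspring are independent, and each one
    inherits the transgene with probability equal to the fraction of founder
    germ cells that carry it (50% in the text).
    """
    if n_offspring < 0:
        raise ValueError("n_offspring must be non-negative")
    if not 0.0 <= germ_cell_fraction <= 1.0:
        raise ValueError("germ_cell_fraction must lie between 0 and 1")
    # Complement of the event that every offspring misses the transgene.
    return 1.0 - (1.0 - germ_cell_fraction) ** n_offspring
```

For example, under these assumptions a litter of four pups contains at least one transgenic animal with probability 1 - 0.5⁴ = 0.9375. A mosaic founder would transmit at a lower rate, which can be modeled by passing a smaller `germ_cell_fraction`.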

In other embodiments, retroviral infection is used to introduce transgenes into a non-human animal. In some embodiments, the retroviral vector is utilized to transfect oocytes by injecting the retroviral vector into the perivitelline space of the oocyte (U.S. Pat. No. 6,080,912, incorporated herein by reference). In other embodiments, the developing non-human embryo can be cultured in vitro to the blastocyst stage. During this time, the blastomeres can be targets for retroviral infection (Jaenisch, Proc. Natl. Acad. Sci. USA 73:1260 [1976]). Efficient infection of the blastomeres is obtained by enzymatic treatment to remove the zona pellucida (Hogan et al., in Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. [1986]). The viral vector system used to introduce the transgene is typically a replication-defective retrovirus carrying the transgene (Jahner et al., Proc. Natl. Acad. Sci. USA 82:6927 [1985]). Transfection is easily and efficiently obtained by culturing the blastomeres on a monolayer of virus-producing cells (Van der Putten, supra; Stewart, et al., EMBO J., 6:383 [1987]). Alternatively, infection can be performed at a later stage. Virus or virus-producing cells can be injected into the blastocoele (Jahner et al., Nature 298:623 [1982]). Most of the founders will be mosaic for the transgene since incorporation occurs only in a subset of cells that form the transgenic animal. Further, the founder may contain various retroviral insertions of the transgene at different positions in the genome that generally will segregate in the offspring. In addition, it is also possible to introduce transgenes into the germline, albeit with low efficiency, by intrauterine retroviral infection of the midgestation embryo (Jahner et al., supra [1982]). Additional means of using retroviruses or retroviral vectors to create transgenic animals known to the art involve the micro-injection of retroviral particles or mitomycin C-treated cells producing retrovirus into the perivitelline space of fertilized eggs or early embryos (PCT International Application WO 90/08832 [1990], and Haskell and Bowen, Mol. Reprod. Dev., 40:386 [1995]).

In other embodiments, the transgene is introduced into embryonic stem cells and the transfected stem cells are utilized to form an embryo. ES cells are obtained by culturing pre-implantation embryos in vitro under appropriate conditions (Evans et al., Nature 292:154 [1981]; Bradley et al., Nature 309:255 [1984]; Gossler et al., Proc. Natl. Acad. Sci. USA 83:9065 [1986]; and Robertson et al., Nature 322:445 [1986]). Transgenes can be efficiently introduced into the ES cells by DNA transfection by a variety of methods known to the art including calcium phosphate co-precipitation, protoplast or spheroplast fusion, lipofection and DEAE-dextran-mediated transfection. Transgenes may also be introduced into ES cells by retrovirus-mediated transduction or by micro-injection. Such transfected ES cells can thereafter colonize an embryo following their introduction into the blastocoel of a blastocyst-stage embryo and contribute to the germ line of the resulting chimeric animal (for review, See, Jaenisch, Science 240:1468 [1988]). Prior to the introduction of transfected ES cells into the blastocoel, the transfected ES cells may be subjected to various selection protocols to enrich for ES cells which have integrated the transgene, assuming that the transgene provides a means for such selection. Alternatively, the polymerase chain reaction may be used to screen for ES cells that have integrated the transgene. This technique obviates the need for growth of the transfected ES cells under appropriate selective conditions prior to transfer into the blastocoel.

In still other embodiments, homologous recombination is utilized to knock-out gene function or create deletion mutants (e.g., mutants in which a particular domain of a NOD is deleted). Methods for homologous recombination are described in U.S. Pat. No. 5,614,396, incorporated herein by reference.

VIII. Drug Screening Using NOD

In some embodiments, the isolated nucleic acid and polypeptides of NOD genes of the present invention (e.g., SEQ ID NOS: 1-22) and related proteins and nucleic acids are used in drug screening applications for compounds that alter (e.g., enhance or inhibit) NOD activity and signaling. The present invention further provides methods of identifying ligands of the NOD proteins of the present invention.

As described above, NOD family proteins (e.g., Nod2) have been shown to mediate the host response to bacterial muropeptides. The present invention is not limited to a particular mechanism. Indeed, an understanding of the mechanism is not necessary to practice the present invention. Nonetheless, it is contemplated that the NOD family proteins of the present invention are involved in host responses to microbes (e.g., bacteria, virus, fungi, etc.). It is further contemplated that some NODs recognize endogenous compounds (e.g., derived from host cells) as ligands. For example, some NODs may recognize host cell proteins induced by stress (e.g., heat shock proteins). Accordingly, in some embodiments, the present invention provides methods of screening for ligands of NOD family proteins (e.g., ligands derived from microbes or host factors). For example, in some embodiments, an assay that measures NOD signaling is used to screen libraries of compounds (e.g., microbial or host derived compounds) for their ability to alter NOD family signaling.

In other embodiments, the present invention provides methods of screening compounds for the ability to alter NOD signaling mediated by natural ligands (e.g., identified using the methods described above). Such compounds find use in the treatment of disease mediated by NOD family members (e.g., inflammatory diseases).

In one screening method, the two-hybrid system is used to screen for compounds (e.g., proteins) capable of altering NOD function(s) (e.g., interaction with a binding partner) in vitro or in vivo. In one embodiment, a GAL4 binding site, linked to a reporter gene such as lacZ, is contacted in the presence and absence of a candidate compound with a GAL4 binding domain linked to a NOD fragment and a GAL4 transactivation domain II linked to a binding partner fragment. Expression of the reporter gene is monitored and a decrease in the expression is an indication that the candidate compound inhibits the interaction of a NOD with the binding partner. Alternatively, the effect of candidate compounds on the interaction of a NOD with other proteins (e.g., proteins known to interact directly or indirectly with the binding partner) can be tested in a similar manner.

In some embodiments, the present invention provides methods of identifying NOD binding partners or ligands that utilize immunoprecipitation. In some embodiments, antibodies to NOD proteins are utilized to immunoprecipitate NODs and any bound proteins. In other embodiments, NOD fusion proteins are generated with tags and antibodies to the tags are utilized for immunoprecipitation. Potential binding partners that immunoprecipitate with NODs can be identified using any suitable method.

In another screening method, candidate compounds are evaluated for their ability to alter NOD signaling by contacting NOD, binding partners, binding partner-associated proteins, or fragments thereof, with the candidate compound and determining binding of the candidate compound to the peptide. The protein or protein fragments are immobilized using methods known in the art such as binding a GST-NOD fusion protein to a polymeric bead containing glutathione. A chimeric gene encoding a GST fusion protein is constructed by fusing DNA encoding the polypeptide or polypeptide fragment of interest to the DNA encoding the carboxyl terminus of GST (See e.g., Smith et al., Gene 67:31 [1988]). The fusion construct is then transformed into a suitable expression system (e.g., E. coli XA90) in which the expression of the GST fusion protein can be induced with isopropyl-β-D-thiogalactopyranoside (IPTG). Induction with IPTG should yield the fusion protein as a major constituent of soluble, cellular proteins. The fusion proteins can be purified by methods known to those skilled in the art, including purification by glutathione affinity chromatography. Binding of the candidate compound to the proteins or protein fragments is correlated with the ability of the compound to disrupt the signal transduction pathway and thus regulate NOD physiological effects (e.g., inflammatory disease).

In another screening method, one of the components of the NOD/binding partner signaling system is immobilized. Polypeptides can be immobilized using methods known in the art, such as adsorption onto a plastic microtiter plate or specific binding of a GST-fusion protein to a polymeric bead containing glutathione. For example, in some embodiments, GST-NOD is bound to glutathione-Sepharose beads. The immobilized peptide is then contacted with another peptide with which it is capable of binding in the presence and absence of a candidate compound. Unbound peptide is then removed and the complex solubilized and analyzed to determine the amount of bound labeled peptide. A decrease in binding is an indication that the candidate compound inhibits the interaction of NOD with the other peptide. A variation of this method allows for the screening of compounds that are capable of disrupting a previously-formed protein/protein complex. For example, in some embodiments a complex comprising a NOD or a NOD fragment bound to another peptide is immobilized as described above and contacted with a candidate compound. The dissolution of the complex by the candidate compound correlates with the ability of the compound to disrupt or inhibit the interaction between NOD and the other peptide.
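
The comparison of binding in the presence and absence of a candidate compound can be expressed numerically, for example as a percent inhibition. The following Python sketch is one possible way to do so and is not prescribed by the present disclosure; the function name and the choice of readout units (e.g., counts of bound labeled peptide) are assumptions made for illustration.

```python
def percent_inhibition(bound_without_compound: float, bound_with_compound: float) -> float:
    """Percent decrease in bound labeled peptide caused by a candidate compound.

    Both arguments are the amount of bound labeled peptide in the same
    (hypothetical) units, measured without and with the compound. A positive
    result indicates reduced binding; a negative result indicates enhanced
    binding.
    """
    if bound_without_compound <= 0:
        raise ValueError("binding in the absence of compound must be positive")
    decrease = bound_without_compound - bound_with_compound
    return 100.0 * decrease / bound_without_compound
```

For instance, a signal of 1000 units without compound and 250 units with compound corresponds to 75% inhibition of the interaction.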

Another technique for drug screening provides high throughput screening for compounds having suitable binding affinity to NOD peptides and is described in detail in WO 84/03564, incorporated herein by reference. Briefly, large numbers of different small peptide test compounds are synthesized on a solid substrate, such as plastic pins or some other surface. The peptide test compounds are then reacted with NOD peptides and washed. Bound NOD peptides are then detected by methods well known in the art.

Another technique uses NOD antibodies, generated as discussed above. Such antibodies are capable of specifically binding to NOD peptides and compete with a test compound for binding to NOD. In this manner, the antibodies can be used to detect the presence of any peptide that shares one or more antigenic determinants of a NOD peptide.

The present invention contemplates many other means of screening compounds. The examples provided above are presented merely to illustrate a range of techniques available. One of ordinary skill in the art will appreciate that many other screening methods can be used.

In particular, the present invention contemplates the use of cell lines transfected with NOD genes and variants thereof for screening compounds for activity, and in particular to high throughput screening of compounds from combinatorial libraries (e.g., libraries containing greater than 10⁴ compounds). The cell lines of the present invention can be used in a variety of screening methods. In some embodiments, the cells can be used in second messenger assays that monitor signal transduction following activation of cell-surface receptors. In other embodiments, the cells can be used in reporter gene assays that monitor cellular responses at the transcription/translation level. In still further embodiments, the cells can be used in cell proliferation assays to monitor the overall growth/no growth response of cells to external stimuli.

In second messenger assays, the host cells are preferably transfected as described above with vectors encoding NOD or variants or mutants thereof. The host cells are then treated with a compound or plurality of compounds (e.g., from a combinatorial library) and assayed for the presence or absence of a response. It is contemplated that at least some of the compounds in the combinatorial library can serve as agonists, antagonists, activators, or inhibitors of the protein or proteins encoded by the vectors. It is also contemplated that at least some of the compounds in the combinatorial library can serve as agonists, antagonists, activators, or inhibitors of protein acting upstream or downstream of the protein encoded by the vector in a signal transduction pathway.

In some embodiments, the second messenger assays measure fluorescent signals from reporter molecules that respond to intracellular changes (e.g., Ca²⁺ concentration, membrane potential, pH, IP₃, cAMP, arachidonic acid release) due to stimulation of membrane receptors and ion channels (e.g., ligand gated ion channels; see Denyer et al., Drug Discov. Today 3:323 [1998]; and Gonzales et al., Drug Discov. Today 4:431-39 [1999]). Examples of reporter molecules include, but are not limited to, FRET (fluorescence resonance energy transfer) systems (e.g., Cuo-lipids and oxonols, EDAN/DABCYL), calcium sensitive indicators (e.g., Fluo-3, FURA 2, INDO 1, and FLUO3/AM, BAPTA AM), chloride-sensitive indicators (e.g., SPQ, SPA), potassium-sensitive indicators (e.g., PBFI), sodium-sensitive indicators (e.g., SBFI), and pH sensitive indicators (e.g., BCECF).

In general, the host cells are loaded with the indicator prior to exposure to the compound. Responses of the host cells to treatment with the compounds can be detected by methods known in the art, including, but not limited to, fluorescence microscopy, confocal microscopy (e.g., FCS systems), flow cytometry, microfluidic devices, FLIPR systems (See, e.g., Schroeder and Neagle, J. Biomol. Screening 1:75 [1996]), and plate-reading systems. In some preferred embodiments, the response (e.g., increase in fluorescent intensity) caused by a compound of unknown activity is compared to the response generated by a known agonist and expressed as a percentage of the maximal response of the known agonist. The maximum response caused by a known agonist is defined as a 100% response. Likewise, the maximal response recorded after addition of an agonist to a sample containing a known or test antagonist is detectably lower than the 100% response.
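
The normalization to a known agonist can be sketched in Python as follows. This is a minimal illustration rather than a required procedure: subtraction of a baseline (unstimulated) signal before scaling is an assumption added here, and the function and parameter names are hypothetical.

```python
def percent_of_max_response(test_signal: float, baseline: float, agonist_max: float) -> float:
    """Express a test compound's response as a percentage of a known agonist's maximum.

    test_signal: fluorescence measured after adding the test compound.
    baseline:    fluorescence of untreated, indicator-loaded cells
                 (baseline subtraction is an assumption of this sketch).
    agonist_max: maximal fluorescence produced by the known agonist,
                 which is defined as the 100% response.
    """
    span = agonist_max - baseline
    if span <= 0:
        raise ValueError("agonist_max must exceed baseline")
    return 100.0 * (test_signal - baseline) / span
```

With a baseline of 200 units and a known-agonist maximum of 1000 units, a test compound producing 600 units scores as a 50% response. In an antagonist experiment, the same calculation applied to the agonist signal recorded in the presence of the antagonist yields a value below 100%.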

The cells are also useful in reporter gene assays. Reporter gene assays involve the use of host cells transfected with vectors encoding a nucleic acid comprising transcriptional control elements of a target gene (i.e., a gene that controls the biological expression and function of a disease target) spliced to a coding sequence for a reporter gene. Therefore, activation of the target gene results in activation of the reporter gene product. In some embodiments, the reporter gene construct comprises the 5′ regulatory region (e.g., promoters and/or enhancers) of a protein whose expression is controlled by NOD in operable association with a reporter gene. Examples of reporter genes finding use in the present invention include, but are not limited to, chloramphenicol transferase, alkaline phosphatase, firefly and bacterial luciferases, β-galactosidase, β-lactamase, and green fluorescent protein. The production of these proteins, with the exception of green fluorescent protein, is detected through the use of chemiluminescent, colorimetric, or bioluminescent products of specific substrates (e.g., X-gal and luciferin). Comparisons between compounds of known and unknown activities may be conducted as described above.

Specifically, the present invention provides screening methods for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to a NOD of the present invention, have an inhibitory (or stimulatory) effect on, for example, NOD expression or NOD activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a NOD substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., NOD genes) either directly or indirectly in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions. Compounds that stimulate the activity of a variant NOD or mimic the activity of a non-functional variant are particularly useful in the treatment of inflammatory diseases.

In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a NOD protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate the activity of a NOD protein or polypeptide or a biologically active portion thereof.

The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone, which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann et al., J. Med. Chem. 37:2678-85 [1994]); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are preferred for use with peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al., Proc. Natl. Acad. Sci. U.S.A. 90:6909 [1993]; Erb et al., Proc. Natl. Acad. Sci. USA 91:11422 [1994]; Zuckermann et al., J. Med. Chem. 37:2678 [1994]; Cho et al., Science 261:1303 [1993]; Carell et al., Angew. Chem. Int. Ed. Engl. 33:2059 [1994]; Carell et al., Angew. Chem. Int. Ed. Engl. 33:2061 [1994]; and Gallop et al., J. Med. Chem. 37:1233 [1994].

Libraries of compounds may be presented in solution (e.g., Houghten, Biotechniques 13:412-421 [1992]), or on beads (Lam, Nature 354:82-84 [1991]), chips (Fodor, Nature 364:555-556 [1993]), bacteria or spores (U.S. Pat. No. 5,223,409; herein incorporated by reference), plasmids (Cull et al., Proc. Natl. Acad. Sci. USA 89:1865-1869 [1992]) or on phage (Scott and Smith, Science 249:386-390 [1990]; Devlin, Science 249:404-406 [1990]; Cwirla et al., Proc. Natl. Acad. Sci. 87:6378-6382 [1990]; Felici, J. Mol. Biol. 222:301 [1991]).

In one embodiment, an assay is a cell-based assay in which a cell that expresses a NOD protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate a NOD's activity is determined. Determining the ability of the test compound to modulate NOD activity can be accomplished by monitoring, for example, changes in enzymatic activity. The cell, for example, can be of mammalian origin.

The ability of the test compound to modulate NOD binding to a compound, e.g., a NOD substrate, can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to a NOD can be determined by detecting the labeled compound, e.g., substrate, in a complex.

Alternatively, a NOD is coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate NOD binding to a NOD substrate in a complex. For example, compounds (e.g., substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

The ability of a compound (e.g., a NOD substrate) to interact with a NOD with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with a NOD without the labeling of either the compound or the NOD (McConnell et al., Science 257:1906-1912 [1992]). As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and a NOD polypeptide.

In yet another embodiment, a cell-free assay is provided in which a NOD protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the NOD protein or biologically active portion thereof is evaluated. Preferred biologically active portions of NOD proteins to be used in assays of the present invention include fragments that participate in interactions with substrates or other proteins, e.g., fragments with high surface probability scores.

Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FRET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos et al., U.S. Pat. No. 4,968,103; each of which is herein incorporated by reference). A fluorophore label is selected such that a first donor molecule's emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy.

Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. A FRET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
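The distance dependence noted above can be illustrated with the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer efficiency is 50%. The following is a minimal sketch only; the function name and the R0 value of 5 nm are illustrative assumptions and are not taken from this specification.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Transfer efficiency from the Forster relation E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Hypothetical Forster radius for an illustrative donor/acceptor pair (assumption).
R0_NM = 5.0

for r in (2.5, 5.0, 10.0):
    # Efficiency falls steeply once the labels are farther apart than R0.
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, R0_NM):.3f}")
```

Because efficiency drops with the sixth power of the separation, acceptor emission is expected to be appreciable only when the labeled molecules are bound or otherwise in close proximity.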

In another embodiment, determining the ability of a NOD protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander and Urbaniczky, Anal. Chem. 63:2338-2345 [1991] and Szabo et al., Curr. Opin. Struct. Biol. 5:699-705 [1995]). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal that can be used as an indication of real-time reactions between biological molecules.

In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

It may be desirable to immobilize a NOD protein, an anti-NOD antibody or its target molecule to facilitate separation of complexed from non-complexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a NOD protein, or interaction of a NOD protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase-NOD fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione Sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione-derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or NOD protein, and the mixture incubated under conditions conducive for complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized (in the case of beads), and complex formation is determined either directly or indirectly, for example, as described above.

Alternatively, the complexes can be dissociated from the matrix, and the level of NOD binding or activity determined using standard techniques. Other techniques for immobilizing either a NOD protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated NOD protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical).

In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-IgG antibody).

This assay is performed utilizing antibodies reactive with NOD protein or target molecules but which do not interfere with binding of the NOD protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or NOD protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the NOD protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the NOD protein or target molecule.

Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including, but not limited to: differential centrifugation (see, for example, Rivas and Minton, Trends Biochem. Sci. 18:284-7 [1993]); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel et al., eds., Current Protocols in Molecular Biology 1999, J. Wiley: New York); and immunoprecipitation (see, for example, Ausubel et al., eds., Current Protocols in Molecular Biology 1999, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, J. Mol. Recognit. 11:141-8 [1998]; Hage and Tweed, J. Chromatogr. Biomed. Sci. Appl. 699:499-525 [1997]). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

The assay can include contacting the NOD protein or biologically active portion thereof with a known compound that binds the NOD to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a NOD protein, wherein determining the ability of the test compound to interact with a NOD protein includes determining the ability of the test compound to preferentially bind to NOD or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

To the extent that a NOD can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins, inhibitors of such an interaction are useful. A homogeneous assay can be used to identify inhibitors.

For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared such that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496, herein incorporated by reference, that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified. Alternatively, a NOD protein can be used as a “bait protein” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al., Cell 72:223-232 [1993]; Madura et al., J. Biol. Chem. 268:12046-12054 [1993]; Bartel et al., Biotechniques 14:920-924 [1993]; Iwabuchi et al., Oncogene 8:1693-1696 [1993]; and Brent WO 94/10300; each of which is herein incorporated by reference), to identify other proteins that bind to or interact with a NOD (“NOD-binding proteins” or “NOD-bp”) and are involved in NOD activity. Such NOD-bps can be activators or inhibitors of signals by the NOD proteins or targets as, for example, downstream elements of a NOD-mediated signaling pathway.
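For the displacement assay described above, in which a quenched label yields a signal above background once a test substance disrupts the preformed complex, hit calling can be sketched as follows. This is a hedged illustration only: the function name, the well readings, and the threshold of the background mean plus three standard deviations are assumptions made for the example and are not specified in this disclosure.

```python
from statistics import mean, stdev


def call_displacement_hits(background, signals, n_sd=3.0):
    """Flag test substances whose signal exceeds background mean + n_sd * SD.

    background: readings from wells holding only the quenched preformed complex.
    signals:    mapping of test-substance name -> reading after addition.
    n_sd:       assumed cutoff in standard deviations (illustrative choice).
    """
    threshold = mean(background) + n_sd * stdev(background)
    return {name: value > threshold for name, value in signals.items()}


# Hypothetical plate readings in arbitrary fluorescence units (assumption).
background_wells = [100.0, 104.0, 98.0, 102.0, 96.0]
test_wells = {"cmpd_1": 105.0, "cmpd_2": 180.0}

hits = call_displacement_hits(background_wells, test_wells)
print(hits)
```

A substance flagged in this way would only be a candidate disruptor of the target gene product-binding partner interaction and would still need confirmation in an independent assay.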

Modulators of NOD expression can also be identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of a NOD mRNA or protein evaluated relative to the level of expression of the NOD mRNA or protein in the absence of the candidate compound. When expression of the NOD mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of NOD mRNA or protein expression. Alternatively, when expression of NOD mRNA or protein is less (i.e., statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of NOD mRNA or protein expression. The level of NOD mRNA or protein expression can be determined by methods described herein for detecting NOD mRNA or protein.
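The comparison described above can be sketched computationally. The following minimal example classifies a candidate compound as a stimulator or inhibitor of expression using an exact two-sided permutation test on the difference in mean expression. The choice of test, the significance level of 0.05, the function names, and the expression values are all illustrative assumptions; the specification does not prescribe a particular statistical method.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabeling n_treated of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        if diff >= observed - 1e-12:  # small tolerance for float round-off
            hits += 1
        count += 1
    return hits / count


def classify_candidate(treated, control, alpha=0.05):
    """Label a compound by its effect on expression relative to untreated samples."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


# Hypothetical relative NOD mRNA levels (arbitrary units, assumption).
without_compound = [1.00, 0.95, 1.05, 1.02]
with_compound = [0.40, 0.45, 0.38, 0.50]
print(classify_candidate(with_compound, without_compound))
```

With four replicates per group there are 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70; more replicates would be needed to resolve smaller effects.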

A modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a NOD protein can be confirmed in vivo, e.g., in an animal such as an animal model for a disease (e.g., an animal with inflammatory disease).

B. Therapeutic Agents

This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a NOD modulating agent or mimetic, a NOD specific antibody, or a NOD-binding partner) in an appropriate animal model (such as those described herein) to determine the efficacy, toxicity, side effects, or mechanism of action of treatment with such an agent. Furthermore, as described above, novel agents identified by the above-described screening assays can be, e.g., used for treatments of inflammatory disease (e.g., including, but not limited to, psoriasis or Crohn's disease). In some embodiments, the agents are NOD ligands or ligand analogs (e.g., identified using the drug screening methods described above).

IX. Pharmaceutical Compositions Containing NOD Nucleic Acid, Peptides, and Analogs

The present invention further provides pharmaceutical compositions which may comprise all or portions of NOD polynucleotide sequences, NOD polypeptides, inhibitors or antagonists of NOD bioactivity, including antibodies, alone or in combination with at least one other agent, such as a stabilizing compound, and may be administered in any sterile, biocompatible pharmaceutical carrier, including, but not limited to, saline, buffered saline, dextrose, and water.

The methods of the present invention find use in treating diseases or altering physiological states characterized by mutant NOD alleles (e.g., inflammatory disease). Peptides can be administered to the patient intravenously in a pharmaceutically acceptable carrier such as physiological saline. Standard methods for intracellular delivery of peptides can be used (e.g., delivery via liposome). Such methods are well known to those of ordinary skill in the art. The formulations of this invention are useful for parenteral administration, such as intravenous, subcutaneous, intramuscular, and intraperitoneal. Therapeutic administration of a polypeptide intracellularly can also be accomplished using gene therapy as described above.

As is well known in the medical arts, dosages for any one patient depend upon many factors, including the patient's size, body surface area, age, the particular compound to be administered, sex, time and route of administration, general health, and interaction with other drugs being concurrently administered.

Accordingly, in some embodiments of the present invention, NOD nucleotide and NOD amino acid sequences can be administered to a patient alone, or in combination with other nucleotide sequences, drugs or hormones, or in pharmaceutical compositions where they are mixed with excipient(s) or other pharmaceutically acceptable carriers. In one embodiment of the present invention, the pharmaceutically acceptable carrier is pharmaceutically inert. In another embodiment of the present invention, NOD polynucleotide sequences or NOD amino acid sequences may be administered alone to individuals subject to or suffering from a disease.

Depending on the condition being treated, these pharmaceutical compositions may be formulated and administered systemically or locally. Techniques for formulation and administration may be found in the latest edition of “Remington's Pharmaceutical Sciences” (Mack Publishing Co., Easton, Pa.). Suitable routes may, for example, include oral or transmucosal administration, as well as parenteral delivery, including intramuscular, subcutaneous, intramedullary, intrathecal, intraventricular, intravenous, intraperitoneal, or intranasal administration.

For injection, the pharmaceutical compositions of the invention may be formulated in aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or physiologically buffered saline. For tissue or cellular administration, penetrants appropriate to the particular barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.

In other embodiments, the pharmaceutical compositions of the present invention can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for oral administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, pills, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral or nasal ingestion by a patient to be treated.

Pharmaceutical compositions suitable for use in the present invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. For example, an effective amount of NOD may be that amount that suppresses apoptosis. Determination of effective amounts is well within the capability of those skilled in the art, especially in light of the disclosure provided herein.

In addition to the active ingredients, these pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers comprising excipients and auxiliaries that facilitate processing of the active compounds into preparations that can be used pharmaceutically. The preparations formulated for oral administration may be in the form of tablets, dragees, capsules, or solutions.

The pharmaceutical compositions of the present invention may be manufactured in a manner that is itself known (e.g., by means of conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or lyophilizing processes).

Pharmaceutical formulations for parenteral administration include aqueous solutions of the active compounds in water-soluble form. Additionally, suspensions of the active compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain substances that increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable stabilizers or agents that increase the solubility of the compounds to allow for the preparation of highly concentrated solutions.

Pharmaceutical preparations for oral use can be obtained by combining the active compounds with solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are carbohydrate or protein fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice, potato, etc.; cellulose such as methyl cellulose, hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; and gums including arabic and tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid or a salt thereof such as sodium alginate.

Dragee cores are provided with suitable coatings such as concentrated sugar solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for product identification or to characterize the quantity of active compound (i.e., dosage).

Pharmaceutical preparations that can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a coating such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients mixed with a filler or binders such as lactose or starches, lubricants such as talc or magnesium stearate, and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycol with or without stabilizers.

Compositions comprising a compound of the invention formulated in a pharmaceutically acceptable carrier may be prepared, placed in an appropriate container, and labeled for treatment of an indicated condition. For polynucleotide or amino acid sequences of NOD, conditions indicated on the label may include treatment of conditions related to inflammatory diseases.

The pharmaceutical composition may be provided as a salt and can be formed with many acids, including but not limited to hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms. In other cases, the preferred preparation may be a lyophilized powder in 1 mM-50 mM histidine, 0.1%-2% sucrose, 2%-7% mannitol at a pH range of 4.5 to 5.5 that is combined with buffer prior to use.

For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. Then, preferably, dosage can be formulated in animal models (particularly murine models) to achieve a desirable circulating concentration range that adjusts NOD levels.

A therapeutically effective dose refers to that amount of NOD that ameliorates symptoms of the disease state. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ (the dose lethal to 50% of the population) and the ED₅₀ (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index, and it can be expressed as the ratio LD₅₀/ED₅₀. Compounds that exhibit large therapeutic indices are preferred. The data obtained from these cell culture assays and additional animal studies can be used in formulating a range of dosage for human use. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, sensitivity of the patient, and the route of administration.
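The therapeutic index defined above is a simple ratio, and ranking candidate compounds by it can be sketched as follows. The compound names and the LD₅₀ and ED₅₀ values are hypothetical and serve only to illustrate the arithmetic; they are not data from this specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Therapeutic index expressed as the ratio LD50 / ED50 (same dose units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg for two illustrative compounds.
candidates = {
    "compound_A": (500.0, 5.0),
    "compound_B": (200.0, 20.0),
}

# Compounds with larger therapeutic indices are preferred, so sort descending.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)

for name in ranked:
    print(name, therapeutic_index(*candidates[name]))
```

Here the first hypothetical compound has an index of 100 and the second an index of 10, so the first would be preferred on this criterion alone.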

The exact dosage is chosen by the individual physician in view of the patient to be treated. Dosage and administration are adjusted to provide sufficient levels of the active moiety or to maintain the desired effect. Additional factors which may be taken into account include the severity of the disease state; age, weight, and gender of the patient; diet, time and frequency of administration, drug combination(s), reaction sensitivities, and tolerance/response to therapy. Long acting pharmaceutical compositions might be administered every 3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature (see U.S. Pat. Nos. 4,657,760; 5,206,344; or 5,225,212, all of which are herein incorporated by reference). Those skilled in the art will employ different formulations for NOD than for the inhibitors of NOD. Administration to the bone marrow may necessitate delivery in a manner different from intravenous injections.

All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the relevant fields are intended to be within the scope of the following claims.

1. A composition comprising an isolated and purified nucleic acid sequence encoding a protein selected from the group consisting of SEQ ID NOs: 12-21.
2. The composition of claim 1, wherein said sequence is operably linked to a heterologous promoter.
3. The composition of claim 1, wherein said sequence is contained within a vector.
4. The composition of claim 3, wherein said vector is within a host cell.
5. The composition of claim 1, wherein said nucleic acid is selected from the group consisting of SEQ ID NOs: 1-10 and variants thereof that are at least 80% identical to SEQ ID NOs: 1-10.
6. The composition of claim 5, wherein said protein is at least 90% identical to SEQ ID NOs: 12-21.
7. The composition of claim 5, wherein said protein is at least 95% identical to SEQ ID NOs: 12-21.
8. The composition of claim 1, wherein said nucleic acid sequence is selected from the group consisting of SEQ ID NOs: 1-10.
9-17. (canceled)